Psychometrika 


VOLUME XV —1950 
JANUARY-DECEMBER 





Editorial Council 


Chairman: L. L. THURSTONE
Editors: A. K. KURTZ, M. W. RICHARDSON
Managing Editor: DOROTHY C. ADKINS
Assistant Managing Editor: SAMUEL B. LYERLY


Editorial Board 


R. L. ANDERSON CHARLES M. HARSH M. W. RICHARDSON 
H. S. CONRAD PAUL HORST P. J. RULON 
ELMER A. CULLER ALSTON S. HOUSEHOLDER WM. STEPHENSON 
E. E. CURETON TRUMAN L. KELLEY GODFREY THOMSON 
MAX D. ENGELHART ALBERT K. KURTZ L. L. THURSTONE 
HENRY E. GARRETT IRVING LORGE LEDYARD TUCKER 
J. P. GUILFORD QUINN MCNEMAR S. S. WILKS 
HAROLD GULLIKSEN CHARLES I. MOSIER HERBERT WOODROW 

FREDERICK MOSTELLER 

GEORGE E. NICHOLSON 





PUBLISHED QUARTERLY 


By THE PSYCHOMETRIC SOCIETY 
AT 23 WEST COLORADO AVENUE 
COLORADO SPRINGS, COLORADO 



















CONTENTS 


SEQUENTIAL SAMPLING PLANS FOR USE IN PSYCHOLOGICAL TEST WORK
    ALLYN W. KIMBALL

THE CALCULATION OF SUMS OF SQUARES FOR INTERACTIONS IN THE ANALYSIS OF VARIANCE
    ALLEN L. EDWARDS AND PAUL HORST

A FACTORIAL STUDY OF THE MULTIPHASIC, STRONG, KUDER, AND BELL INVENTORIES USING A POPULATION OF ADULT MALES
    WM. C. COTTLE

A DEVICE FOR FACILITATING THE COMPUTATION OF THE FIRST FOUR MOMENTS ABOUT THE MEAN
    JOHN LYMAN AND PIETRO V. MARCHETTI

A NOTE ON THE CALCULATION OF WEIGHTS FOR MAXIMUM BATTERY RELIABILITY
    BERT F. GREEN, JR.

A NOTE ON THE MEASUREMENT OF REVERSALS OF PERSPECTIVE
    J. S. BRUNER, L. POSTMAN, AND F. MOSTELLER

S. S. WILKS. Elementary Statistical Analysis
    A Review by FREDERICK MOSTELLER

TRUMAN LEE KELLEY. Fundamentals of Statistics
    A Review by J. P. GUILFORD

N. RASHEVSKY. Mathematical Biophysics
    A Review by JOHN M. REINER

ROBERT L. THORNDIKE. Personnel Selection: Test and Measurement Techniques

BOOKS RECEIVED








VOLUME FIFTEEN MARCH 1950 NUMBER 1






















Psychometrika
A JOURNAL DEVOTED TO THE DEVELOPMENT OF PSYCHOLOGY AS A
QUANTITATIVE RATIONAL SCIENCE


















































THE PSYCHOMETRIC SOCIETY · ORGANIZED IN 1935



















PSYCHOMETRIKA, the official journal of the Psychometric Society, is devoted to 
the development of psychology as a quantitative rational science. Issued four 
times a year, on March 15, June 15, September 15, and December 15 


MARCH 1950, VOLUME 15, NUMBER 1


Printed for the Psychometric Society at 28 West Colorado Avenue, Colorado
Springs, Colorado. Entered as second class matter, September 17, 1940, at the
Post Office of Colorado Springs, Colorado, under the act of March 3, 1879.
Editorial Office, Educational Testing Service, Princeton, New Jersey.


Subscription Price: The regular subscription rate is $10.00 per volume. The
subscriber receives each issue as it comes out, and a second complete set for binding
at the end of the year. All annual subscriptions start with the March issue and
cover the calendar year. All back issues are available. The price is $1.25 per
issue or $5.00 per volume (one set only). Members of the Psychometric Society
pay annual dues of $5.00, of which $4.50 is in payment of a subscription to
Psychometrika. Student members of the Psychometric Society pay annual dues
of $3.00, of which $2.70 is in payment for the journal.


Application for membership and student membership in the Psychometric Society,
together with a check for dues for the calendar year in which application is 
made, should be sent to 


T. GAYLORD ANDREWS 

Chairman of the Membership Committee 
Department of Psychology 

The University of Maryland 

College Park, Maryland 


Payments: All bills and orders are payable in advance. Checks covering mem- 
bership dues should be made payable to the Psychometric Society. Checks cover- 
ing regular subscription to Psychometrika and back issue orders should be made 
payable to the Psychometric Corporation. All checks, notices of change of ad- 
dress, and business communications should be addressed to 


ROBERT L. THORNDIKE

Treasurer, Psychometric Society and Psychometric Corporation 
Teachers College, Columbia University

New York 27, New York 


Articles on the following subjects are published in Psychometrika: 


(1) the development of quantitative rationale for the solution of psychological
    problems;

(2) general theoretical articles on quantitative methodology in the social and
    biological sciences;

(3) new mathematical and statistical techniques for the evaluation of
    psychological data;

(4) aids in the application of statistical techniques, such as nomographs,
    tables, work-sheet layouts, forms, and apparatus;

(5) critiques or reviews of significant studies involving the use of
    quantitative techniques.


The emphasis is to be placed on articles of type (1), in so far as articles of this 
type are available. 
(Continued on the back inside cover page)





























































PSYCHOMETRIKA—VOL. 15, NO. 1 
MARCH, 1950 


SEQUENTIAL SAMPLING PLANS FOR USE IN 
PSYCHOLOGICAL TEST WORK* 


ALLYN W. KIMBALL 


USAF SCHOOL OF AVIATION MEDICINE 
RANDOLPH FIELD, TEXAS 


In large-scale psychological testing programs, it happens fre- 
quently that a group of items must be checked for accuracy with 
respect to certain characteristics. In many cases it is not feasible to 
check each item in the group separately, so that a decision concern- 
ing the entire group is usually reached by examining in detail only 
a small portion of the group. This paper proposes some sequential 
sampling plans for use in this type of work. A general discussion 
of the problem is followed by an illustrative example of the method 
applied to checking scores on groups of psychological test papers. 


1. Introduction 


In psychological testing programs, situations often arise in which
a group of items must be checked for accuracy with respect to one or
more characteristics. In many cases a high degree of accuracy is 
required, but the time and effort necessary to check each item sepa- 
rately are prohibitive. Such circumstances occur, for example, when 
a large group of test papers is submitted for rescoring. Similar prob- 
lems are encountered when a large deck of punched cards must be 
checked for agreement with coded data. In general, the problem of 
checking transcription of items is one which also falls in this cate- 
gory. Under the assumption that complete reprocessing of a group 
of items is not practicable, this paper presents a method of checking 
which achieves a balance between accuracy of results and expendi- 
ture of effort. In the text which follows, the problem of verifying 
scores on test papers is discussed in detail. The method may be ap- 
plied to other problems in psychological test work such as those men- 
tioned above with very little alteration. 


2. Choice of a Method 


When large groups of individuals are involved in psychological
testing programs, test papers are scored manually with scoring keys
and automatically by IBM Test Scoring Machines. In either case errors
in scoring may result. For most purposes, highly accurate scoring
is required, so that it is desirable that each group of test papers
be rescored and checked. The tolerance concerning degree of accuracy
per individual score and the percentage of error in the group
must be determined by the test administrator in accordance with the
requirements of the program. The degree of accuracy to be maintained
must in practice be balanced against the cost of achieving it.

*The stimulus for this paper was provided by Dr. S. B. Sells.

One method which is complete but not infallible is to rescore every 
paper in the group and to accept the scores on those papers for which 
the two scorings agree. Papers whose scorings do not agree are re- 
scored repeatedly until agreement is reached between at least two 
scorings, and this score is taken as correct. This method is not gen- 
erally acceptable because it is laborious and costly. The alternative 
methods propose rescoring only a portion of the group and either ac- 
cepting or rejecting the scores for the entire group on the basis of 
the number of agreements in the portion examined. Groups which 
are rejected under this criterion are rescored completely. In the term- 
inology of the industrial statistician, this is equivalent to taking a 
sample from a lot and accepting or rejecting the lot on the basis of 
an examination of the sample. Obviously, in any such procedure it 
is possible to make errors. It is possible to reject a lot which should 
be accepted and to accept a lot which should be rejected. However, 
the probabilities of making such errors can be specified in the design 
of the sampling plan. In this case, the experimenter knows that if he 
uses a particular sampling plan repeatedly under similar conditions, 
he will make errors of the kind described above with a frequency not 
greater than that indicated by the probabilities he specifies. Certain 
procedures now in use provide for the checking and rescoring of, say, 
every tenth paper, with a decision to reject the lot (and check every 
paper in the lot) only when more than a fixed number of incorrectly 
marked papers are observed. Some experimenters who have applied 
this technique to specific situations for long periods of time have a 
good idea of its reliability and achieve success in using it. In general, 
however, such experience is lacking, and recourse must be had to 
more exact methods which provide estimates of the risks involved in 
using them. 

The sequential method of sampling was chosen instead of single, 
double, or multiple sampling methods for two reasons. First of all, 
it requires less sampling, on the average, than any of the alternative 
methods. This is the outstanding advantage of the sequential method 
and is one which satisfies the requirement of a minimum amount of 
rescoring. Secondly, it incorporates the probabilities of both kinds of
error into its acceptance and rejection criteria. Many of the single,
double, and multiple sampling plans available do not fix the prob- 
ability of the second kind of error, i.e., the probability that a lot is 
accepted when in fact it should be rejected. Some sampling plans fix 
the probability of the first kind of error and the average outgoing 
quality limit (AOQL) which is defined in section 4. For the purpose 
of checking scoring on test papers, the second kind of error is more 
important than the first kind of error, in the sense that it is more 
important to avoid making it. Specifically, the experimenter may ob- 
ject to rescoring every paper in a group which was rejected when it 
should have been accepted, but he will object much more strenuously
to accepting a group which contains so many incorrectly marked pa- 
pers that it should have been rejected. Since the sequential method 
is superior with respect to both of these important requirements, it 
was chosen to determine the sampling plans. 


3. Description of the Sequential Method 


A group of scored papers is presented for rescoring and checking.
It is assumed that the group is homogeneous in the sense that each 
test paper had the same chance of being scored correctly. The defini- 
tion of a correctly scored paper is left to the individual test adminis- 
trator. He may require that the original and check scores agree per- 
fectly, or he may allow deviations of 1% or even 2% in some cases. 
That is, if scorings are made on a percentage basis, he may be will- 
ing to classify as acceptable a paper whose original and check scores 
differ by 1% or 2%. Whatever decision is made, it must be adhered 
to during the application of the method to a single group of papers. 
Papers are chosen at random from the group one at a time* and re- 
scored. Each paper is classified as correctly or incorrectly scored. 
After the first paper has been classified, the scorer refers to a table 
and makes one of three decisions: 


(1) Accept the group. 

(2) Reject the group. 

(3) Rescore another paper.

If his decision is (1) or (2), the sampling stops. If his decision 
is (3), he rescores another paper, refers again to the table, and makes 
one of the three decisions. He continues this process until the group 
of papers is either accepted or rejected. It should be noted that the 
sample size is not fixed in advance but is determined by the method
itself and will vary from one group to the next.

*It may be more practicable to select a large number of papers at random
and have them available for the person who is doing the rescoring.
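
The three-way decision described above can be sketched as a short loop. This is only an illustration under stated assumptions: the function name `sequential_check` and the two lookup callables are hypothetical stand-ins for the scorer's reference table, and the toy limits at the bottom are invented for the example and are not one of the paper's plans.

```python
from typing import Callable, Iterable, Optional, Tuple


def sequential_check(
    is_incorrect: Iterable[bool],
    accept_number: Callable[[int], Optional[int]],
    reject_number: Callable[[int], int],
) -> Tuple[str, int, int]:
    """Rescore papers one at a time until the group is accepted or rejected.

    is_incorrect  -- one flag per randomly drawn paper (True = incorrectly scored)
    accept_number -- hypothetical lookup: acceptance number for sample size n,
                     or None where the table shows a dash
    reject_number -- hypothetical lookup: rejection number for sample size n
    Returns (decision, papers rescored, incorrect papers found).
    """
    n = 0  # papers rescored so far
    d = 0  # running count of incorrectly scored papers
    for flag in is_incorrect:
        n += 1
        d += int(flag)
        if d >= reject_number(n):
            return "reject", n, d       # decision (2): reject the group
        a = accept_number(n)
        if a is not None and d <= a:
            return "accept", n, d       # decision (1): accept the group
        # decision (3): rescore another paper
    return "undecided", n, d            # ran out of papers before a decision


# Toy limits for illustration only (assumed, not taken from the paper):
# acceptance is impossible before the 5th paper; two errors reject.
toy_accept = lambda n: 0 if n >= 5 else None
toy_reject = lambda n: 2

print(sequential_check([False] * 10, toy_accept, toy_reject))
print(sequential_check([False, True, False, True], toy_accept, toy_reject))
```

The sample size is an output of the loop rather than an input, which is the point made in the text: a clean group stops as soon as acceptance becomes possible, and a poor group stops at the first paper that pushes the count to the rejection number.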

In the preparation of the table to which the scorer refers, certain
quantities are employed which must be specified by the test administrator.
First of all, he must state a proportion $p_0$ of incorrectly
marked papers which he will be willing to accept. Obviously $p_0$ cannot
be zero, since then every paper in the group would have to be
inspected. He must also state a proportion $p_1 > p_0$ of incorrectly
marked papers which he wants definitely to reject. Secondly, he must
agree to accept a probability $\alpha$ that groups containing a proportion
$p_0$ of incorrectly marked papers will be rejected, and a probability $\beta$
that groups containing a proportion $p_1$ of incorrectly marked papers
will be accepted. In other words he must agree to take a chance $\alpha$ of
rejecting a group of quality $p_0$ which is acceptable, and a chance $\beta$
of accepting a group of quality $p_1$ which is not acceptable. These are
the probabilities of error referred to in the last section. The average
amount of sampling required to reach a decision varies with $\alpha$, $\beta$,
$p_0$, and $p_1$. The values of these quantities employed in the next section
to determine various sampling plans were chosen as representative
of those which might be desired in psychological testing programs.
Occasionally, compensating adjustments may be necessary to
achieve average sample sizes at a prescribed level. For a thorough
discussion of this problem, the reader is referred to (2), section 2.14.


4. Sampling Plans 


Once the quantities $\alpha$, $\beta$, $p_0$, and $p_1$ have been chosen, the sampling
plan is completely determined. Four combinations of these values
have been selected and the corresponding sampling plans are presented
in detail in Tables 1-4 and in Figures 1-4. Each plan consists
of a reference table, an operating characteristic (OC) curve, an average
sample number (ASN) curve and an average outgoing quality
(AOQ) curve. The OC, ASN, and AOQ curves are not needed by the
scorer. They are included here only to provide the reader with a complete
picture of how the sampling plan operates.

In applying the method, the scorer selects at random a batch of 
test papers from the group under consideration equal in number to 
the maximum value of n in the first column. If it is more convenient, 
he may select smaller batches in sizes corresponding to the groupings 
of n in the first column of the table. He begins rescoring the test 
papers one by one and keeps a running count of the total number of 
incorrectly marked papers (D). As long as D remains between A
and R, the scorer continues to rescore additional papers. If at any
point D becomes equal to R, sampling is stopped, the group of papers
is rejected, and each paper in the group is rescored. If at any point
D becomes equal to A, sampling is stopped and the group of papers
is accepted. The occurrence of dashes in the A column means simply
that it is impossible to reach a decision of acceptance within the
corresponding range of n. For example, in Table 1, it is impossible to
reach a decision of acceptance before the 90th paper is rescored. If
up to that point no incorrectly marked papers had been observed, the 
group would be accepted. One additional instruction is necessary. It 
almost never happens that the scorer reaches the limit of the table 
without making a decision. However, if it does occur, the sequential 
process is stopped or truncated at that point, and the group of papers 
is accepted or rejected by applying the rules for truncation given at 
the bottom of each table. 

The procedure for preparing reference tables is exceedingly simple
and requires very little computation. The first column in the reference
table contains values of n, the sample size. In the sequential
method, n, of course, starts at one and increases integrally until the
process is terminated and the group of papers is accepted or rejected.
The second column contains the acceptance numbers (A) and the
third column the rejection numbers (R), corresponding to values of
n. In order to determine A and R for a given value of n, we compute:

must be computed. These are the average sample sizes which would
obtain for groups containing proportions $p_0$ and $p_1$, respectively, of
incorrectly marked papers. If $\bar{n}$ is the larger of $\bar{n}_0$ and $\bar{n}_1$,
truncation occurs at approximately $3\bar{n}$. At that point if D is nearest to A,
the group is accepted; if D is nearest to R, the group is rejected; and
if D is midway between A and R, a coin may be tossed to reach a
decision.
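
The acceptance numbers, rejection numbers, and truncation point can be sketched from Wald's standard straight-line formulas for a binomial proportion. This is a sketch under assumptions, not a transcription of this paper's expressions: the function names are hypothetical, the formulas are the usual textbook form of the sequential probability ratio test, and the rounding convention (floor for acceptance, ceiling for rejection) is a guess, so individual breakpoints may differ by a paper or so from the published tables.

```python
import math


def wald_plan(alpha, beta, p0, p1):
    """Slope and intercepts of the acceptance and rejection lines."""
    g = math.log(p1 / p0) + math.log((1 - p0) / (1 - p1))  # common denominator
    s = math.log((1 - p0) / (1 - p1)) / g                  # slope of both lines
    h_accept = math.log((1 - alpha) / beta) / g            # acceptance intercept
    h_reject = math.log((1 - beta) / alpha) / g            # rejection intercept
    return s, h_accept, h_reject


def accept_reject_numbers(n, alpha, beta, p0, p1):
    """Return (A, R) for sample size n; A is None where acceptance is impossible."""
    s, h_accept, h_reject = wald_plan(alpha, beta, p0, p1)
    a = math.floor(s * n - h_accept)   # accept when D <= s*n - h_accept
    r = math.ceil(s * n + h_reject)    # reject when D >= s*n + h_reject
    return (a if a >= 0 else None), r


def average_sample_sizes(alpha, beta, p0, p1):
    """Approximate average sample sizes for groups of quality p0 and p1."""
    log_a = math.log((1 - beta) / alpha)   # log of the rejection threshold
    log_b = math.log(beta / (1 - alpha))   # log of the acceptance threshold

    def mean_step(p):
        # expected change in the log probability ratio per paper rescored
        return p * math.log(p1 / p0) + (1 - p) * math.log((1 - p1) / (1 - p0))

    n0 = ((1 - alpha) * log_b + alpha * log_a) / mean_step(p0)
    n1 = (beta * log_b + (1 - beta) * log_a) / mean_step(p1)
    return n0, n1


alpha, beta, p0, p1 = 0.05, 0.01, 0.001, 0.05
print(accept_reject_numbers(50, alpha, beta, p0, p1))
n0, n1 = average_sample_sizes(alpha, beta, p0, p1)
print(round(3 * max(n0, n1)))  # truncation point at three times the larger average
```

With these values the larger average sample size is a little over 90, so truncation at three times that figure falls close to 270.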

Figures 1, 2, 3, and 4 show the OC, ASN, and AOQ curves corresponding
to the reference tables in Tables 1, 2, 3, and 4, respectively.
The abscissa of each curve is p, the proportion of incorrectly
marked papers in the lots submitted for rescoring. The ordinate of
the OC curve is the probability that a group of papers with a proportion
p of incorrectly marked papers will be accepted. For a group
containing no incorrectly marked papers, the probability is one that it
will be accepted. Likewise, for a group in which every paper is incorrectly
marked, the probability is zero that it will be accepted. A
little reflection will reveal that two other points on this curve are
$(p_0, 1-\alpha)$ and $(p_1, \beta)$. One of the most useful properties of the OC
curve is that it provides a basis for the comparison of different sampling
plans providing the same degree of protection.
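
Points on an OC curve can be sketched with Wald's parametric approximation, in which an auxiliary parameter h traces out pairs (p, probability of acceptance). This is an illustrative sketch, not the author's computation: the function name `oc_point` and the choice of h values are assumptions, and the approximation ignores the small overshoot of the boundaries.

```python
def oc_point(h, alpha, beta, p0, p1):
    """Return (p, probability of acceptance) for auxiliary parameter h != 0."""
    big_a = (1 - beta) / alpha      # rejection threshold on the probability ratio
    big_b = beta / (1 - alpha)      # acceptance threshold on the probability ratio
    x = (1 - p1) / (1 - p0)
    y = p1 / p0
    p = (1 - x ** h) / (y ** h - x ** h)
    accept = (big_a ** h - 1) / (big_a ** h - big_b ** h)
    return p, accept


alpha, beta, p0, p1 = 0.05, 0.01, 0.001, 0.05
for h in (3.0, 2.0, 1.0, 0.5, -0.5, -1.0, -2.0):
    p, accept = oc_point(h, alpha, beta, p0, p1)
    print(f"p = {p:.4f}   P(accept) = {accept:.3f}")
```

Setting h = 1 recovers the point $(p_0, 1-\alpha)$ and h = -1 recovers $(p_1, \beta)$, the two points named in the text.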

The ordinate of the ASN curve is the average sample size or ex- 
pected sample size for groups containing a proportion p of incorrectly 
marked papers. By referring to the ASN curve, the test administra- 
tor can form judgments concerning the applicability of the method 
in terms of the relative magnitudes of the expected sample size and 
the group size. This point is discussed more fully in section 5. 

The ordinate of the AOQ curve is the average proportion of in- 
correctly marked papers which will be present in the groups after 
inspection. In each case the ordinate is less than the corresponding 
value p, since groups which are rejected by the sequential method 
are completely rescored and freed of any incorrectly marked papers. 
The maximum ordinate of the AOQ curve is called the average out- 
going quality limit (AOQL). This implies that no matter what the 
quality of the groups submitted may be, the average proportion of 
incorrectly marked papers in the groups after inspection will not be 
greater than the AOQL. In general, it will be substantially less than 
the AOQL. 
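
The ASN and AOQ ordinates can be sketched from the same parametric approximation. The following is a rough illustration under assumptions: the function name `curves_at` and the grid of h values are hypothetical, the ASN uses Wald's approximate formula, and the AOQ is taken as p times the probability of acceptance, which presumes groups large relative to the sample and rejected groups freed of all incorrectly marked papers.

```python
import math


def curves_at(h, alpha, beta, p0, p1):
    """Return (p, P(accept), ASN, AOQ) for auxiliary parameter h != 0."""
    log_a = math.log((1 - beta) / alpha)     # log of the rejection threshold
    log_b = math.log(beta / (1 - alpha))     # log of the acceptance threshold
    log_x = math.log((1 - p1) / (1 - p0))
    log_y = math.log(p1 / p0)
    p = (1 - math.exp(h * log_x)) / (math.exp(h * log_y) - math.exp(h * log_x))
    accept = (math.exp(h * log_a) - 1) / (math.exp(h * log_a) - math.exp(h * log_b))
    mean_step = p * log_y + (1 - p) * log_x  # expected log-ratio change per paper
    asn = (accept * log_b + (1 - accept) * log_a) / mean_step
    aoq = p * accept                         # accepted groups keep their errors
    return p, accept, asn, aoq


alpha, beta, p0, p1 = 0.05, 0.01, 0.001, 0.05
grid = [h / 10 for h in range(-30, 31) if h != 0]
points = [curves_at(h, alpha, beta, p0, p1) for h in grid]
aoql = max(point[3] for point in points)     # largest AOQ found on this grid
print(f"approximate AOQL on this grid: {aoql:.4f}")
```

Because the maximum is taken over a finite grid, the printed figure is only a lower estimate of the AOQL implied by the approximation.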
5. Scope of the Method


The applicability and usefulness of this method are directly pro- 
portional to the size of the group of papers submitted for checking. 
One of the assumptions made in the mathematical derivation of the 
method is that group size is large compared to sample size. This 
places certain restrictions on its use. If the sample size is more than 
about one-fourth the size of the group, too much rescoring will result. 
For the cases illustrated in Tables 1, 2, 3, and 4, group sizes as low 
as 400 or 500 can be used without fear of excessive rescoring. Even 
for groups of sizes moderately less than 400, the sequential method 
will probably still result in smaller average sample sizes than the cor- 
responding single sampling plans which take group size into account. 
It should be mentioned that it is possible to derive a sequential sam- 
pling plan which takes group size into account in determining the 
acceptance and rejection criteria. However, the complexities involved 
in applying this plan are so great as to render it impracticable. 


Reference to the average sample size curves shows that for groups 
of quality less than $p_0$ and greater than $p_1$, the amount of rescoring
will often be much less than for groups of quality between $p_0$ and $p_1$.
In other words, groups which are well scored or poorly scored will
usually be accepted or rejected with very little rescoring. In practice,
the groups submitted for rescoring will vary in quality and the “true”
average sample size will be a composite of several average sample
sizes, one for each quality of group submitted. Since the quality of
the groups is never known a priori, the “true” average sample size 
for a given situation cannot be computed. Generally, most of the 
groups submitted will contain very few incorrectly marked papers and 
the actual amount of sampling required will be small. Furthermore, 
whatever the amount of sampling required may be, it will be less on 
the average than what would be required under any of the corre- 
sponding single, double, or multiple sampling plans. 

The foregoing discussion provides a basis for determining the 
scope of the method. There is little doubt that the method would 
be most useful in large-scale testing programs in which groups of 
about 400 or more test papers are encountered. In this case, there is 
assurance that the risks involved conform with those stipulated in 
the design of the sampling plan and that the amount of rescoring 
which will be required is a minimum among all possible sampling 
plans providing identical protection. The method may also be ap- 
plied to situations in which group sizes are less than 400. In this case, 
the risks are predicted with an accuracy equal to that in the previous 
case. The only substantial difference between the two cases is that
when group size is small relative to sample size, the amount of sampling
of isolated groups may be excessive. If most of the groups are
expected to be either very good or very bad with respect to propor- 
tion of incorrectly marked papers, even this objection disappears. In 
addition to its sound mathematical and statistical basis, the method 
incorporates simplicity, ease of application, and reliability. Few sta- 
tistical tools combine so many desirable features. 


6. Comments on the Theory 


The statistical theory underlying this method is due mainly to A. 
Wald (1). Following the Neyman-Pearson theory of testing hypotheses,
the probability ratio method of determining acceptance and rejection
criteria was employed, and the results are supported by full
mathematical rigor. Nevertheless, there are a few points which war- 
rant clarification. The procedure for truncation is not considered in 
the development of the pure sequential process. It is employed in 
sequential sampling for practical reasons only. The probability is 
unity that the sequential process will be terminated with acceptance 
or rejection of the group before n reaches $\infty$. However, it is possible
in an isolated case for n to become very large before a decision is
reached. It is to avoid this possibility that sequential sampling is
usually stopped or truncated at a reasonable sample size. According
to Wald, the effect of truncation at $3\bar{n}$ on the operating characteristic
curve, and hence on $\alpha$ and $\beta$, is negligibly small, since it is almost
certain that sampling will terminate before n reaches $3\bar{n}$. The problem
has not yet been adequately treated theoretically, but results in the
form of upper bounds on $\alpha$ and $\beta$ have been obtained for special cases
and give no cause for apprehension.

The operating characteristics and average sample size curves are 
approximations in that only fifteen points were computed for each 
curve. They do not provide a basis for precise estimation but do add 
to the characterization of the test. The formulas used in computing 
average sample sizes and acceptance and rejection numbers involve 
the quantities $\alpha$, $\beta$, $p_0$, and $p_1$, and are given in section 4. The
formulas necessary to compute the OC, ASN, and AOQ curves are more
extensive and will not be presented here.

It would be desirable for purposes of comparison to give sample 
sizes required for single sampling plans providing the same degree 
of protection as the sequential sampling plans given in section 4. In 
general it is not possible to design a single sampling plan which pro- 
vides exactly the same degree of protection. It is possible to pass the 
OC curve through one of the two points $(p_0, 1-\alpha)$ and $(p_1, \beta)$, but
usually not through both points simultaneously. In practice, the OC
curve is passed through one point and as close to the other point as 
possible. For the plans discussed in this paper, the author has found 
that the single sample size obtained when the OC curve is passed 
through one point is quite different from that obtained when the OC 
curve is passed through the other point. Consequently comparisons 
between the sequential plan and the single sampling plan have not 
been included. For general discussions of differences among types 
of sampling plans, the reader is referred to (3) and (4). 


7. Summary 


As a special case of its general use in psychological test work, 
the sequential sampling method is applied to the problem of checking 
and rescoring groups of psychological test papers. This method has 
been chosen in preference to others providing equal protection be- 
cause it is shown to be superior with respect to certain important re- 
quirements. Computations necessary to apply the method are de- 
scribed and several sampling plans are illustrated. Finally, the use- 
fulness and applicability of the method in psychological testing pro- 
grams are discussed. 


TABLE 1
Reference Table for Plan Defined By
$\alpha = .05$   $\beta = .01$   $p_0 = .001$   $p_1 = .05$

      n         A     R
     1-19       -     1
    20-89       -     2
    90-97       0     2
    98-168      0     3
   169-175      1     3
   176-246      1     4
   247-253      2     4
   254-270      2     5

Procedure for Truncation at n = 270

If D = 3, accept the group.
If D = 4, reject the group.


TABLE 2
Reference Table for Plan Defined By
$\alpha = .05$   $\beta = .01$   $p_0 = .005$   $p_1 = .05$

      n         A     R
     1-37       -     2
    38-88       -     3
    89-99       -     4
   100-139      0     4
   140-150      0     5
   151-190      1     5
   191-201      1     6
   202-241      2     6
   242-252      2     7
   253-292      3     7
   293-303      3     8
   304-344      4     8
   345-354      4     9
   355-360      5     9

Procedure for Truncation at n = 360

If D = 6, accept the group.
If D = 8, reject the group.
If D = 7, toss a coin to decide.


TABLE 3
Reference Table for Plan Defined By
$\alpha = .10$   $\beta = .01$   $p_0 = .001$   $p_1 = .05$

      n         A     R
     1-33       -     1
    34-89       -     2
    90-112      0     2
   113-168      0     3
   169-190      1     3
   191-240      1     4

Procedure for Truncation at n = 240

If D = 2, accept the group.
If D = 3, reject the group.


TABLE 4
Reference Table for Plan Defined By
$\alpha = .10$   $\beta = .01$   $p_0 = .005$   $p_1 = .05$

      n         A     R
       1        -     1
     2-51       -     2
    52-97       -     3
    98-102      0     3
   103-148      0     4
   149-153      1     4
   154-198      1     5
   199-204      2     5
   205-249      2     6
   250-255      3     6
   256-300      3     7
   301-305      4     7
   306-330      4     8

Procedure for Truncation at n = 330

If D = 5, accept the group.
If D = 7, reject the group.
If D = 6, toss a coin to decide.




























[Figure 1: three panels, each plotted against Quality of Group Submitted (p):
Probability of Acceptance (OC), Average Sample Size (ASN), and
Average Quality of Group After Inspection (AOQ).]

Figure 1. Representative Curves for Sampling Plan defined by
$\alpha = .05$, $\beta = .01$, $p_0 = .001$, $p_1 = .05$


THE CALCULATION OF SUMS OF SQUARES FOR INTERACTIONS
IN THE ANALYSIS OF VARIANCE


ALLEN L. EDWARDS AND PAUL HORST 
THE UNIVERSITY OF WASHINGTON 


A method is described for the calculation of the sum of squares 
for a second-order interaction. It is then shown that the method is 
general and can be used for the calculation of the sum of squares for 
any higher-order interaction. 


I 


Most of the current statistical texts, for example, McNemar (3), 
Snedecor (4), Lindquist (2), and others which are concerned with 
the analysis of variance describe methods for calculating the sums 
of squares for simple or first-order interactions. When a single sec- 
ond-order interaction sum of squares is discussed, however, it is 
usually said that this may be obtained by subtraction of the sums of 
squares for the main effects and the simple interactions from the to- 
tal sum of squares. Since, in some experiments of factorial design, 
it may be necessary to calculate directly the sum of squares for sec- 
ond or higher-order interactions, it seems worth while to demonstrate 
a method for doing this. 

For simplicity, we shall take the case of a factorial design in- 
volving but a single second-order interaction and carry out the usual 
analysis of variance, obtaining the second-order interaction sum of 
squares by subtraction. We shall then show a method for the direct 
calculation of this sum of squares and prove that it can be general- 
ized for any interaction sum of squares, regardless of order. 

Let us assume that we have a factorial design in which A is 
varied in 4 ways, B is varied in 3 ways, and C is varied in 2 ways. 
Then we shall have (4) (3) (2) = 24 combinations of variables, each 
combination corresponding to a particular experimental condition. 
One replication of the experiment will involve 24 observations, and 
the 23 degrees of freedom available would be allocated in the following
way:

Sum of squares                            df
Main effects:                A             3
                             B             2
                             C             1
First-order interactions:    A × B         6
                             A × C         3
                             B × C         2
Second-order interaction:    A × B × C     6
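
The allocation above follows the usual rule that an effect's degrees of freedom are the product of (levels − 1) over the factors it involves. A minimal sketch, with the function name `degrees_of_freedom` chosen here purely for illustration:

```python
from itertools import combinations
from math import prod


def degrees_of_freedom(levels):
    """Map each main effect and interaction to its degrees of freedom."""
    names = sorted(levels)
    table = {}
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            # df of an effect = product of (levels - 1) over its factors
            table[" x ".join(combo)] = prod(levels[f] - 1 for f in combo)
    return table


levels = {"A": 4, "B": 3, "C": 2}
df = degrees_of_freedom(levels)
for effect, value in df.items():
    print(f"{effect:10s} {value}")
print("total", sum(df.values()))  # one less than the 24 combinations
```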


Assume that the outcomes of the experiment are as given in 
Table 1. 





























TABLE 1
Outcomes of a 4 × 3 × 2 Factorial Design

                  $A_1$    $A_2$    $A_3$    $A_4$      Sum

$C_1$   $B_1$       60       90       94       86       330
        $B_2$       54       92       98       96       340
        $B_3$       70       76       80       60       286
        Sum        184      258      272      242       956

$C_2$   $B_1$       58       72       78       84       292
        $B_2$       76       82       74       64       296
        $B_3$       66       56       72       78       272
        Sum        200      210      224      226       860

Sum                384      468      496      468     1,816





The sum of squares between the experimental conditions will be given by

$$(60)^2 + (54)^2 + (70)^2 + \cdots + (78)^2 - \frac{(1{,}816)^2}{24} = 3{,}917.33.$$

It is this sum of squares, 3,917.33, based upon 23 degrees of freedom,
that is to be further analyzed into the component parts enumerated.
From the data of Table 1, we may set up a table for variable A
and variable C, ignoring the B classification. Thus we obtain
Table 2. The two sums, 956 and 860, are the sums of scores for the
C variable, and the sum of squares for this variable will be given by

$$\frac{(956)^2}{12} + \frac{(860)^2}{12} - \frac{(1{,}816)^2}{24} = 384.00.$$


TABLE 2
Outcomes of a 4 × 3 × 2 Factorial Design with the B Classification Ignored

            A1     A2     A3     A4      Sum
C1         184    258    272    242      956
C2         200    210    224    226      860
Sum        384    468    496    468    1,816

And, similarly, the sum of squares for the A variable will be given by

$$\frac{(384)^2}{6} + \frac{(468)^2}{6} + \frac{(496)^2}{6} + \frac{(468)^2}{6} - \frac{(1{,}816)^2}{24} = 1{,}176.00 .$$

The sum of squares for the A × C interaction may also be obtained
from Table 2. We first calculate the sum of squares between the 8 sums
entered in the cells of the table and obtain

$$\frac{(184)^2}{3} + \frac{(200)^2}{3} + \frac{(258)^2}{3} + \cdots + \frac{(226)^2}{3} - \frac{(1{,}816)^2}{24} = 2{,}029.33 .$$

Then the sum of squares for the interaction of A and C may be obtained
by subtracting the sum of squares for A and the sum of squares for C
from the sum of squares between the 8 sums of Table 2. Thus the
interaction sum of squares for variables A and C will be given by

2,029.33 − 1,176.00 − 384.00 = 469.33 .
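The three quantities taken from Table 2 share one pattern: square each sum, divide by the number of observations entering it, add, and subtract the correction term. A short sketch of that pattern, with the helper name `ss_from_sums` chosen here for illustration:

```python
# Table 2: sums over the B classification, rows C1 and C2, columns A1..A4.
table2 = {"C1": [184, 258, 272, 242], "C2": [200, 210, 224, 226]}

grand_total = sum(sum(row) for row in table2.values())   # 1,816
correction = grand_total ** 2 / 24

def ss_from_sums(sums, n_per_sum):
    """Sum of squares between totals, each total being based on n_per_sum scores."""
    return sum(s * s for s in sums) / n_per_sum - correction

ss_c = ss_from_sums([sum(row) for row in table2.values()], 12)
ss_a = ss_from_sums([sum(col) for col in zip(*table2.values())], 6)
ss_cells = ss_from_sums([s for row in table2.values() for s in row], 3)
ss_ac = ss_cells - ss_a - ss_c            # A x C interaction, by subtraction

print(round(ss_c, 2), round(ss_a, 2), round(ss_cells, 2), round(ss_ac, 2))
```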














TABLE 3
Outcomes of a 4 × 3 × 2 Factorial Design with the C Classification Ignored

            A1     A2     A3     A4      Sum
B1         118    162    172    170      622
B2         130    174    172    160      636
B3         136    132    152    138      558
Sum        384    468    496    468    1,816





We may now go back to the data of Table 1, and set up another 
table showing the classification of A and B, with the C classification 
ignored. In this manner we obtain the data of Table 3. Then the sum 
of squares for the B variable will be given by 


$$\frac{(622)^2}{8} + \frac{(636)^2}{8} + \frac{(558)^2}{8} - \frac{(1{,}816)^2}{24} = 432.33 .$$

To obtain the interaction sum of squares for A and B, we first
calculate the sum of squares between the 12 sums of Table 3. Thus

$$\frac{(118)^2}{2} + \frac{(130)^2}{2} + \frac{(136)^2}{2} + \cdots + \frac{(138)^2}{2} - \frac{(1{,}816)^2}{24} = 2{,}129.33 .$$

Then the desired interaction sum of squares for A and B may be
obtained in the usual way, by subtraction. Thus

2,129.33 − 1,176.00 − 432.33 = 521.00 .














TABLE 4
Outcomes of a 4 × 3 × 2 Factorial Design with the A Classification Ignored

            B1     B2     B3      Sum
C1         330    340    286      956
C2         292    296    272      860
Sum        622    636    558    1,816





To obtain the sum of squares for the interaction of variables B and
C, we set up Table 4 for these variables, ignoring the A
classification. Then the sum of squares between the 6 sums of Table 4
will be given by

$$\frac{(330)^2}{4} + \frac{(292)^2}{4} + \frac{(340)^2}{4} + \cdots + \frac{(272)^2}{4} - \frac{(1{,}816)^2}{24} = 879.33 .$$


By subtracting the sum of squares for the B variable and the C vari- 
able, which we have already calculated, we obtain the interaction sum 
of squares for B and C. Thus 


879.33 — 432.33 — 384.00 = 63.00. 


The sum of squares for the second-order interaction, A × B × C, is
then obtained by subtracting the sums of squares for variables A, B,
and C, and the simple interactions, A × B, A × C, and B × C, from the
sum of squares based upon the variation between the 24 experimental
conditions. Thus


3,917.33 − 1,176.00 − 432.33 − 384.00 − 521.00
        − 469.33 − 63.00 = 871.67 .


The method described is the one usually given for the calculation 
of a second-order interaction sum of squares, when only one such in- 
teraction is involved in the experimental design. 
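The whole analysis by subtraction can be reproduced in a few lines. The sketch below assumes the Table 1 scores are stored in a dictionary keyed by zero-based (A, B, C) level indices; the helper `ss_between` is an illustrative name, not part of the original method.

```python
# Table 1, stored as scores[(a, b, c)] with zero-based level indices.
rows = {
    0: [[60, 90, 94, 86], [54, 92, 98, 96], [70, 76, 80, 60]],  # C1, rows B1..B3
    1: [[58, 72, 78, 84], [76, 82, 74, 64], [66, 56, 72, 78]],  # C2, rows B1..B3
}
scores = {(a, b, c): rows[c][b][a]
          for c in rows for b in range(3) for a in range(4)}
N = len(scores)
correction = sum(scores.values()) ** 2 / N

def ss_between(axes):
    """SS between the totals obtained by keeping only `axes` (0=A, 1=B, 2=C)."""
    totals = {}
    for key, x in scores.items():
        sub = tuple(key[i] for i in axes)
        totals[sub] = totals.get(sub, 0) + x
    per_total = N / len(totals)           # observations entering each total
    return sum(t * t for t in totals.values()) / per_total - correction

ss = {"A": ss_between([0]), "B": ss_between([1]), "C": ss_between([2])}
ss["AxB"] = ss_between([0, 1]) - ss["A"] - ss["B"]
ss["AxC"] = ss_between([0, 2]) - ss["A"] - ss["C"]
ss["BxC"] = ss_between([1, 2]) - ss["B"] - ss["C"]
# Second-order interaction: what remains after all lower-order terms are removed.
ss["AxBxC"] = ss_between([0, 1, 2]) - sum(ss.values())

for name, value in ss.items():
    print(f"{name:<6} {value:9.2f}")
```

Each added factor lengthens the chain of subtractions, which is the inconvenience the direct method of the next section is meant to avoid.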


II 


Goulden (1) presents, without proof, a method of direct calculation
for a second-order interaction, but does not generalize the method
for higher-order interactions. We shall illustrate the method with
the data at hand and then prove that the method is general in
application.

The interaction to be calculated is that of A × B × C. Consider
only the data in the upper portion of Table 1. From the data there it
is possible to calculate a sum of squares between the 12 cell entries.
A second sum of squares could be calculated between the 3 rows, and
a third sum of squares between the 4 columns. The row sum of
squares would be the sum of squares for the B variable under
condition C1. The column sum of squares would be the sum of squares
for the A variable under condition C1. If we subtract these two sums
of squares from the sum of squares between the 12 cells, we shall
have an interaction sum of squares. The interaction sum of squares
thus obtained will be the interaction for A × B under condition C1,
and this may be symbolized by C1(A × B). The calculations described
could be repeated for the entries in the lower portion of Table 1, to
arrive at the interaction sum of squares, C2(A × B). Carrying out
these operations, we obtain:


C1(A × B):

$$\text{Between cells} = (60)^2 + (54)^2 + (70)^2 + \cdots + (60)^2 - \frac{(956)^2}{12} = 2{,}646.67$$

$$\text{Rows} = \frac{(330)^2}{4} + \frac{(340)^2}{4} + \frac{(286)^2}{4} - \frac{(956)^2}{12} = 412.67$$

$$\text{Columns} = \frac{(184)^2}{3} + \frac{(258)^2}{3} + \frac{(272)^2}{3} + \frac{(242)^2}{3} - \frac{(956)^2}{12} = 1{,}494.67$$

Interaction C1(A × B) = 2,646.67 − 412.67 − 1,494.67 = 739.33








C2(A × B):

$$\text{Between cells} = (58)^2 + (76)^2 + (66)^2 + \cdots + (78)^2 - \frac{(860)^2}{12} = 886.67$$

$$\text{Rows} = \frac{(292)^2}{4} + \frac{(296)^2}{4} + \frac{(272)^2}{4} - \frac{(860)^2}{12} = 82.67$$

$$\text{Columns} = \frac{(200)^2}{3} + \frac{(210)^2}{3} + \frac{(224)^2}{3} + \frac{(226)^2}{3} - \frac{(860)^2}{12} = 150.67$$

Interaction C2(A × B) = 886.67 − 82.67 − 150.67 = 653.33


Summating the interactions A × B under the separate C conditions,
we have


ΣC(A × B) = 739.33 + 653.33 = 1,392.66 .


Now we have already calculated the interaction between variables A
and B under the combined C conditions, i.e., from the data of Table
3, and this we have found to be equal to 521.00. Then the second-order
interaction sum of squares, A × B × C, may be obtained by subtracting
the A × B interaction sum of squares from the sum of the interactions
for A × B under the separate C conditions. Thus


Interaction A × B × C = ΣC(A × B) − A × B ,

and substituting in the above, we obtain

Interaction A × B × C = 1,392.66 − 521.00 = 871.66 ,


which checks, within errors of rounding, with the value previously
found for the second-order interaction.
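The direct calculation just described can be sketched as follows. The function `interaction_ss` and its `n_per_cell` argument are illustrative names; the argument handles the pooled table, whose entries are each the total of two scores.

```python
# Upper and lower halves of Table 1: rows B1..B3, columns A1..A4.
c1 = [[60, 90, 94, 86], [54, 92, 98, 96], [70, 76, 80, 60]]
c2 = [[58, 72, 78, 84], [76, 82, 74, 64], [66, 56, 72, 78]]

def interaction_ss(table, n_per_cell=1):
    """Rows-by-columns interaction SS; every entry is a total of n_per_cell scores."""
    n_cells = sum(len(row) for row in table)
    correction = sum(map(sum, table)) ** 2 / n_cells
    cells = sum(x * x for row in table for x in row) - correction
    rows = sum(sum(row) ** 2 / len(row) for row in table) - correction
    cols = sum(sum(col) ** 2 / len(col) for col in zip(*table)) - correction
    # Dividing by n_per_cell puts an SS computed on totals back on the score scale.
    return (cells - rows - cols) / n_per_cell

pooled = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(c1, c2)]  # Table 3

within_c = interaction_ss(c1) + interaction_ss(c2)   # C1(AxB) + C2(AxB)
ab = interaction_ss(pooled, n_per_cell=2)            # AxB with C ignored
abc = within_c - ab                                  # second-order interaction

print(round(within_c, 2), round(ab, 2), round(abc, 2))
```

Carried at full precision, the sum of the two separate interactions is 1,392.67 rather than 1,392.66, which accounts for the small rounding discrepancy noted in the text.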

The method described above for calculating the sum of squares
for a second-order interaction can be varied to fit the needs of a
particular design. For example, the interaction sum of squares might
have been obtained by calculating the A × C interactions under the
separate B conditions, summing, and then subtracting the A × C
interaction for the combined B conditions. Similarly, we could have
calculated the B × C interactions under the separate A conditions,
added these, and then subtracted the B × C interaction for the
combined A conditions. Thus, in general, a second-order interaction
will be given by

Interaction A × B × C = ΣA(B × C) − B × C
                      = ΣB(A × C) − A × C
                      = ΣC(A × B) − A × B


and a third-order interaction will be given by

Interaction A × B × C × D = ΣA(B × C × D) − B × C × D
                          = ΣB(A × C × D) − A × C × D
                          = ΣC(A × B × D) − A × B × D
                          = ΣD(A × B × C) − A × B × C

and a similar series of equations may be written for any higher-order
interaction.
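Because each form expresses an interaction in terms of interactions of the next lower order, the rule lends itself to recursion. The sketch below is one possible rendering, not the authors' procedure: it carries the same rule one step further down, so that a first-order interaction is itself obtained as ΣB(A) − A, and it assumes a balanced design with one score per cell. The names `main_effect_ss` and `interaction_ss` are illustrative.

```python
# Table 1 as scores[(a, b, c)] with zero-based level indices.
rows = {
    0: [[60, 90, 94, 86], [54, 92, 98, 96], [70, 76, 80, 60]],  # C1
    1: [[58, 72, 78, 84], [76, 82, 74, 64], [66, 56, 72, 78]],  # C2
}
scores = {(a, b, c): rows[c][b][a]
          for c in rows for b in range(3) for a in range(4)}

def main_effect_ss(data, axis):
    """SS between the level totals of one factor within the given observations."""
    totals = {}
    for key, x in data.items():
        totals[key[axis]] = totals.get(key[axis], 0) + x
    n = len(data)
    per_total = n / len(totals)           # balanced design assumed
    return sum(t * t for t in totals.values()) / per_total - sum(data.values()) ** 2 / n

def interaction_ss(data, axes):
    """Interaction SS of the factors in `axes`: sum of the lower-order
    interactions within each level of the last factor, minus the same
    interaction with that factor ignored."""
    if len(axes) == 1:
        return main_effect_ss(data, axes[0])
    *rest, last = axes
    groups = {}
    for key, x in data.items():
        groups.setdefault(key[last], {})[key] = x
    separate = sum(interaction_ss(group, rest) for group in groups.values())
    return separate - interaction_ss(data, rest)

via_c = interaction_ss(scores, [0, 1, 2])   # sum over C of C(AxB), minus AxB
via_b = interaction_ss(scores, [0, 2, 1])   # sum over B of B(AxC), minus AxC
via_a = interaction_ss(scores, [1, 2, 0])   # sum over A of A(BxC), minus BxC
print(round(via_c, 2), round(via_b, 2), round(via_a, 2))
```

All three orderings should agree, which is the equivalence of the three forms stated above; with a fourth index in the keys the same function would give a third-order interaction.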


III 


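If we prove in particular that ΣD(A × B × C) − A × B × C =
A × B × C × D, the general case will be evident.
We let


$\bar{X}_{....}$ be the mean of all observations.


$\bar{X}_{h...}$ be the mean of all observations for the hth category of A,
      and $\bar{X}_{.i..}$, $\bar{X}_{..j.}$, $\bar{X}_{...k}$ have similar interpretations
      for classifications B, C, and D, respectively.


$\bar{X}_{hi..}$ = the mean of all observations in the hth classification of
      A and the ith classification of B. We have similar
      interpretations for $\bar{X}_{h.j.}$, $\bar{X}_{h..k}$, $\bar{X}_{.ij.}$, $\bar{X}_{.i.k}$, $\bar{X}_{..jk}$.


$\bar{X}_{hij.}$ = the mean of all observations in the hth classification of A,
      the ith of B, the jth of C. We have similar interpretations
      for $\bar{X}_{hi.k}$, $\bar{X}_{h.jk}$, $\bar{X}_{.ijk}$.


We then indicate the fundamental observation equations by

$$x_{hijk} = X_{hijk} - \bar{X}_{....} \tag{1}$$

$$\begin{aligned}
z_{h...} &= \bar{X}_{h...} - \bar{X}_{....} \\
z_{.i..} &= \bar{X}_{.i..} - \bar{X}_{....} \\
z_{..j.} &= \bar{X}_{..j.} - \bar{X}_{....} \\
z_{...k} &= \bar{X}_{...k} - \bar{X}_{....}
\end{aligned} \tag{2}$$

$$\begin{aligned}
z_{hi..} &= \bar{X}_{hi..} - \bar{X}_{h...} - \bar{X}_{.i..} + \bar{X}_{....} \\
z_{h.j.} &= \bar{X}_{h.j.} - \bar{X}_{h...} - \bar{X}_{..j.} + \bar{X}_{....} \\
z_{h..k} &= \bar{X}_{h..k} - \bar{X}_{h...} - \bar{X}_{...k} + \bar{X}_{....} \\
z_{.ij.} &= \bar{X}_{.ij.} - \bar{X}_{.i..} - \bar{X}_{..j.} + \bar{X}_{....} \\
z_{.i.k} &= \bar{X}_{.i.k} - \bar{X}_{.i..} - \bar{X}_{...k} + \bar{X}_{....} \\
z_{..jk} &= \bar{X}_{..jk} - \bar{X}_{..j.} - \bar{X}_{...k} + \bar{X}_{....}
\end{aligned} \tag{3}$$

$$\begin{aligned}
z_{hij.} &= \bar{X}_{hij.} - \bar{X}_{hi..} - \bar{X}_{h.j.} - \bar{X}_{.ij.} + \bar{X}_{h...} + \bar{X}_{.i..} + \bar{X}_{..j.} - \bar{X}_{....} \\
z_{hi.k} &= \bar{X}_{hi.k} - \bar{X}_{hi..} - \bar{X}_{h..k} - \bar{X}_{.i.k} + \bar{X}_{h...} + \bar{X}_{.i..} + \bar{X}_{...k} - \bar{X}_{....} \\
z_{h.jk} &= \bar{X}_{h.jk} - \bar{X}_{h.j.} - \bar{X}_{h..k} - \bar{X}_{..jk} + \bar{X}_{h...} + \bar{X}_{..j.} + \bar{X}_{...k} - \bar{X}_{....} \\
z_{.ijk} &= \bar{X}_{.ijk} - \bar{X}_{.ij.} - \bar{X}_{.i.k} - \bar{X}_{..jk} + \bar{X}_{.i..} + \bar{X}_{..j.} + \bar{X}_{...k} - \bar{X}_{....}
\end{aligned} \tag{4}$$

$$\begin{aligned}
z_{hijk} = X_{hijk} &- (\bar{X}_{hij.} + \bar{X}_{hi.k} + \bar{X}_{h.jk} + \bar{X}_{.ijk}) \\
&+ (\bar{X}_{hi..} + \bar{X}_{h.j.} + \bar{X}_{h..k} + \bar{X}_{.ij.} + \bar{X}_{.i.k} + \bar{X}_{..jk}) \\
&- (\bar{X}_{h...} + \bar{X}_{.i..} + \bar{X}_{..j.} + \bar{X}_{...k}) + \bar{X}_{....}
\end{aligned} \tag{5}$$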
The main effects are obtained by squaring and summing for 
equations of type (2). Similarly, first-order interactions are obtained 
from equations (3), second-order from (4), and third-order from (5). 

We wish to prove that

$$\sum z_{hijk}^2 + \sum z_{hij.}^2 = \sum \left( X_{hijk} - \bar{X}_{hi.k} - \bar{X}_{h.jk} - \bar{X}_{.ijk} + \bar{X}_{h..k} + \bar{X}_{.i.k} + \bar{X}_{..jk} - \bar{X}_{...k} \right)^2 , \tag{6}$$

where the summations are over all observations in the sample.
Since $z_{hijk}$ and $z_{hij.}$ are orthogonal, the equation is proved if we
prove

$$z_{hijk} + z_{hij.} = X_{hijk} - \bar{X}_{hi.k} - \bar{X}_{h.jk} - \bar{X}_{.ijk} + \bar{X}_{h..k} + \bar{X}_{.i.k} + \bar{X}_{..jk} - \bar{X}_{...k} . \tag{7}$$

Adding the first equation of set (4) to equation (5) gives (7).

If we had added the second equation of (4) to (5), each term on 
the right-hand side of (7) would have had a “j” subscript rather than 
a “k.” Similarly, the third and fourth equations would have yielded, 
respectively, an “i” and an “h” subscript for each term on the right 
of (7). These would have proved the other three equations for the 
A X B X CX D interaction. 

To prove the relations for any number of classifications, it is suf- 
ficient to point out that if we add to the highest-order equation any 
equation from the immediately preceding set, then all right-hand 
terms will have at least one subscript in common. This is the sub- 
script of the classification for which the interactions involving all 
other classifications are calculated separately. Summing these and 
subtracting the highest-order interaction excluding the variable 
yields the highest-order interaction for the complete set of classifica- 
tions. 
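The identity proved above can also be checked numerically. The sketch below does so for an arbitrary four-way table with one observation per cell; the data are random numbers generated only for the check, and the helper names (`mean_keeping`, `z`, `z_within`) are illustrative.

```python
import random
from itertools import combinations, product

random.seed(0)
shape = (2, 3, 2, 3)                       # levels of A, B, C, D (arbitrary)
cells = list(product(*(range(n) for n in shape)))
X = {cell: random.gauss(0, 1) for cell in cells}   # synthetic observations

def mean_keeping(cell, keep):
    """Mean of all observations agreeing with `cell` on the axes in `keep`."""
    vals = [X[c] for c in cells if all(c[a] == cell[a] for a in keep)]
    return sum(vals) / len(vals)

def z(cell, axes):
    """Interaction deviation for `axes`: alternating sum of marginal means."""
    return sum((-1) ** (len(axes) - r) * mean_keeping(cell, keep)
               for r in range(len(axes) + 1)
               for keep in combinations(axes, r))

def z_within(cell, axes, fixed):
    """The same alternating sum, computed separately within a level of `fixed`."""
    return sum((-1) ** (len(axes) - r) * mean_keeping(cell, keep + (fixed,))
               for r in range(len(axes) + 1)
               for keep in combinations(axes, r))

# Left side: third-order interaction plus A x B x C with D ignored.
lhs = sum(z(c, (0, 1, 2, 3)) ** 2 for c in cells) + sum(z(c, (0, 1, 2)) ** 2 for c in cells)
# Right side: A x B x C computed within each D level and summed.
rhs = sum(z_within(c, (0, 1, 2), 3) ** 2 for c in cells)

print(abs(lhs - rhs) < 1e-9)
```

Replacing the fixed axis 3 by 0, 1, or 2 (and the interaction axes by the remaining three) checks the other three forms of the third-order interaction in the same way.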
A FACTORIAL STUDY OF THE MULTIPHASIC,
STRONG, KUDER, AND BELL INVENTORIES 
USING A POPULATION OF ADULT MALES* 


WM. C. COTTLE 
UNIVERSITY OF KANSAS 


In a centroid factor analysis of the Multiphasic, Strong, Kuder, 
and Bell inventories using a population of 400 adult males, eight 
common factors dealing with aspects of personality as measured by 
these instruments were isolated. Seven of the factors were meaning-
ful and one was a residual. This study indicates little overlap be-
tween the two personality and the two interest inventories. It would
appear that factors found in these instruments measuring aspects
of personality are dichotomous in nature and are not common to
the two types of instruments included in this study. That is, two of 
the factors were common to the two personality inventories, and five 
of the factors were common to the two interest inventories. 


Purpose 


The purpose of this study was to determine whether there are 
any common factors measured by the Minnesota Multiphasic Person- 
ality Inventory, short group form; the Strong Vocational Interest 
Blank for Men, using scores for groups I, II, V, VIII, IX, X, and the 
three non-occupational keys; the Kuder Preference Record, including 
the masculinity-femininity scale, and the Bell Adjustment Inventory, 
student form. These variables are described briefly in Table 1. It was 
not anticipated that this study would contribute extensively to theory 
of personality, but rather that it would indicate possibilities for fur- 
ther research with personality and interest inventories used in edu- 
cational and vocational counseling. 


The Sample 


The population sample was composed of 400 male veterans of 
World War II who were referred for testing and counseling at the 
Psychological Services Center of Syracuse University under the Vet- 
erans testing program. These veterans would appear to comprise a 
fairly normal sample of college adult males in the immediate post-war 


*Abstract of dissertation submitted at Syracuse University, January, 1949, 
in partial fulfillment of requirements for the degree of Doctor of Education. 


period. The following five items are offered in substantiation of this 
statement: 




1. Competent counselors acting as veterans’ appraisers recommended
   training at the professional level for 237, or 59%, of the 400
   cases. There were 131, or 33%, recommended for training at the
   semi-professional and managerial level. The remaining 32, or 8%,
   were recommended for training at lower occupational levels.

2. The age of these cases at the time of testing as reported on
   V. A. form 1902 ranged from 18 to 48. The median age was 23.3
   years and the mode was 23 years. The mean age was 23.8 years.

3. The school grade completed at the time of testing as reported on
   V. A. form 1902 ranged from grade eight to grade nineteen. The
   median grade completed was 12.9 and the mode was grade 12. The
   mean grade completed was 12.9. In this group there were 23 who
   had not completed high school and 46 had completed a four-year
   college course. The educational attainment of five was unknown.

4. The range of I.Q. for 388 of the cases was from 80 to 135 on the
   Higher Form of the Otis Self-Administering Test. The median I.Q.
   was 118, with the mode being 122. The mean I.Q. of this group was
   115.5 for the 388 cases tested on the same instrument.

5. When we consider the number of cases with medical diagnosis of
   disability, it is well to bear in mind that 260, or 66%, of the
   cases were tested under Public Law 346. This means that almost
   two-thirds of the cases when released from service were judged
   physically and mentally fit. (This corresponds closely to the
   fact that 240, or 60% of the cases had no deviate scores on the
   MMPI.) There were only 140, or 34%, with a disability rating.
   Only one-fifth of these 140 cases tested under Public Law 16, or
   a total of 29 cases, had been diagnosed as neuropsychiatric cases
   by an armed services medical rating board. Thus, at the time of
   discharge only 7% of the total group were classified as being
   neuropsychiatric in nature.


Statistical Methods 
Let us consider now statistical methods used in handling these 
data. 
Any cases with raw scores greater than 20 on the question scale
(?), or with raw scores greater than 9 on either the lie scale (L) or
validity scale (F) of the MMPI were eliminated from the sample.

The masculinity-femininity score of the Kuder was computed for 
the 400 cases according to the procedure given in Kuder’s revised 
manual (5). 

Pearson product-moment correlations were computed using raw 
scores of the 400 cases on all thirty-four variables. (The question 
score was omitted in these computations because of the limitations 
which had been placed upon this scale.) No tests for linearity were 
made on these correlation coefficients. 

These correlation coefficients were then entered into a correlation 
matrix of 1122 cells. One half of this matrix is shown in Table 2. 
There is marked relationship shown in this matrix between many of 
the scales of the Kuder and the Strong, and between the scales of the 
MMPI and the Bell, which counselors would expect to be related. Be- 
tween the personality and interest inventories, however, there were 
only low positive or negative correlations and more frequently these 
coefficients were zero. 
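The figure of 1122 cells is the number of off-diagonal entries in a 34 × 34 matrix, of which half are distinct. The sketch below shows the count together with a plain product-moment computation; the three short score lists are invented for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

n_variables = 34
off_diagonal_cells = n_variables * (n_variables - 1)   # cells excluding the diagonal
unique_pairs = off_diagonal_cells // 2                  # one half of the matrix

# Hypothetical raw scores for three scales and five cases (not from the study).
raw = {
    "scale_1": [12, 15, 9, 20, 14],
    "scale_2": [30, 33, 25, 41, 31],
    "scale_3": [7, 3, 9, 2, 6],
}
r = {(a, b): pearson(raw[a], raw[b]) for a, b in combinations(raw, 2)}

print(off_diagonal_cells, unique_pairs)
for pair, value in r.items():
    print(pair, round(value, 2))
```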

Thurstone’s complete centroid method was used in factoring this
correlation matrix (6). Extraction of factors was continued until the
standard deviation of the residual matrix was slightly less than the 
standard error of the average correlation coefficient in the original 
matrix (.0477 as compared with a standard error of .0484). This re- 
sulted in eight sets of centroid loadings shown in Table 3. 
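For readers unfamiliar with the procedure, the first step of a centroid extraction and the stopping comparison can be sketched as below. This is a simplified illustration under stated assumptions, not the computation carried out in the study: it extracts only the first factor, omits the sign reflection that later factors require, takes the communality estimates on the diagonal as given, and reads "standard deviation of the residual matrix" as the standard deviation of the off-diagonal residuals. The small matrix is hypothetical.

```python
from math import sqrt

def first_centroid_factor(R):
    """First centroid loadings: column sums divided by the root of the grand sum.
    R is a square correlation matrix with communality estimates on the diagonal."""
    col_sums = [sum(col) for col in zip(*R)]
    total = sum(col_sums)
    return [s / sqrt(total) for s in col_sums]

def residual(R, loadings):
    """Residual matrix after removing the contribution of one factor."""
    n = len(R)
    return [[R[i][j] - loadings[i] * loadings[j] for j in range(n)] for i in range(n)]

def offdiag_sd(M):
    """Standard deviation of the off-diagonal entries (an assumed reading)."""
    vals = [M[i][j] for i in range(len(M)) for j in range(len(M)) if i != j]
    mean = sum(vals) / len(vals)
    return sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))

# Hypothetical one-factor matrix built from known loadings, for illustration only.
true_loadings = [0.8, 0.6, 0.5]
R = [[a * b for b in true_loadings] for a in true_loadings]

loadings = first_centroid_factor(R)
res = residual(R, loadings)
STANDARD_ERROR = 0.0484                    # the value quoted in the text
stop = offdiag_sd(res) < STANDARD_ERROR    # stopping rule as described above

print([round(v, 3) for v in loadings], stop)
```

On a matrix generated by a single factor the loadings are recovered and the residuals vanish, so extraction would stop after one factor; with real data the residual matrix would be reflected and factored again until the criterion is met.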

Thurstone’s single-plane method of rotation was used. Inspection 
of the final plots of the test projections on the normals of one plane 
versus another indicated that so many adjustments appeared neces- 
sary that a new method of rotation was attempted. By the method of 
minimizing weighted sums, further rotations were completed. Final 
adjustments produced the transformation matrix A,, shown in Table 
6. Correlation of reference vectors is shown in Table 7. Table 8 gives 
the inverse of Table 6. 

The diagonal matrix D used in the formula DA,,* is given in 
Table 9. Tables 10 and 11 show direction cosines for the primary fac- 
tors and the intercorrelation of the primary factors. These primary 
factors are oblique. 

The rotated factor matrix is shown in Table 4 and the factor 
pattern in Table 5. Figures 1-4 show projections on the normals to 


each plane. 


The Meaning of Factors 
Perhaps it might be well to note here that any of the conclusions 
and generalizations which may be made here are understood to apply
to this particular sample only. They cannot necessarily be applied to 
any other population sample until further research verifies or denies 
the results of this pilot study. With this in mind we proceed to the next 
step—an attempt to identify the factors. 

It is recognized that identifying, analyzing, or attaching labels to 
factors is a hazardous and difficult undertaking at best. This is true 
because no two psychologists can agree completely upon a lexicon of 
trait names and because we are each interpreting our findings and 
those of others from an individualized frame of reference. We cannot 
agree on a common terminology of traits. The important thing to re- 
member is that these eight factors have been isolated in this study 
and any attempt to name them should be done only to promote their 
acceptance and use by others. As Guilford has said (1) “The naming 
depends upon features that the clustering measures seem to have in 
common and that are unique to them. Attaching a meaningful label 
facilitates communication and systematic thinking. Any label or defi- 
nition of a factor should be regarded as a hypothesis, in the same man- 
ner that any trait name is a hypothesis.” It seems wiser in this study 
to refrain from giving any labels to these factors, but rather to explore 
their meaning as it is indicated by the item content of scales having 
the highest saturations of each factor. This would provide evidence 
concerning the advisability and the direction of future research. 

Thurstone has indicated that he considers a rotated factor loading
of ± .3 or .35 to be significant (2). The factor pattern in Table 5 is
constructed on the assumption that a loading of ± .33 is significant
for the purpose of helping to identify the factors.
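Applying the criterion amounts to keeping every scale whose loading is at least .33 in absolute value and sorting the survivors by sign. A minimal sketch, using the legible loadings reported below for Plane C plus one invented sub-criterion entry for contrast:

```python
CRITERION = 0.33   # magnitude treated as significant for identifying factors

# Loadings on the normal to Plane C as reported in the text; the last
# entry is hypothetical and is included only to show a rejected scale.
loadings_plane_c = {
    "Strong Group VIII": 0.67,
    "Kuder Computational": 0.69,
    "Strong Group I": -0.37,
    "Kuder Soc. Serv.": -0.40,
    "Hypothetical scale": 0.12,
}

positive = {k: v for k, v in loadings_plane_c.items() if v >= CRITERION}
negative = {k: v for k, v in loadings_plane_c.items() if v <= -CRITERION}

print("Positive:", positive)
print("Negative:", negative)
```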

Perhaps before we consider the matter of analyzing factors, we 
should consider the thirty-four scales of the four inventories and the 
labels or names that have been attached to them. 


The Strong Vocational Interest Blank for Men by construction is 
valid for the labels attached to each scale (3). Strong determined the 
items which contributed to each occupational interest pattern by the 
manner in which they discriminated between the interests of success- 
ful people in a specific occupational group and a group considered 
typical of “men-in-general.” Therefore the items of the accountant
scale, for example, are typical of the interests of accountants as dif- 
fering in nature from a general male group and can be classified as 
being of a business detail nature. Strong classified the group keys by 
factorial analysis showing the keys to be grouped together. It would 
seem reasonable to accept his findings, as Darley does (4), and say, 
for example, that the occupations in Group VIII (accountant, office 
worker, and purchasing agent) are “Business Detail” in nature.

Kuder developed the Preference Record by statistical methods so 
that the items of each scale correlate highly with each other and the 
scales themselves have low intercorrelation with other scales (5). The 
items of each scale are activities and a person scoring high on a speci- 
fic scale has expressed a high degree of interest in the activities com- 
prising that scale. The labels attached to each scale seem consistent 
with the activities of that scale; that is, they appear valid for that 
scale. The scales of the Strong and Kuder that one would expect to be 
correlated show this relationship in Table 2. For example, the M.F. 
key of the Strong correlates .63 with that of the Kuder. The Mechani- 
cal scale of the Kuder correlates .68 with the M.F. scale of the 
Strong.* 

The items included in each scale of the MMPI are items corres- 
ponding to the syndromes characteristic of the trait names assigned 
to each scale and so used in psychology and psychiatry. The writer 
has investigated and separated each item into its corresponding 
scale or scales. They are not listed here because their knowledge by 
non-professional persons might invalidate the use of the instrument. 

The items of the Bell have been grouped in the same fashion and 
are items dealing with reactions to verbal situations symbolizing 
names assigned each scale. They have not been listed here for the 
same reason given above for not listing MMPI items. 

With these facts in mind we attempt to identify the factors iso- 
lated in the study. 

Reference to Table 4 shows that the normal to Plane A has bi- 
polar test projections on it as follows: 


         Positive                            Negative
Strong Group II ..........  .44     Strong Group X ...........  −.50
Strong M.F. ..............  .69     [illegible] ..............  −.35
Kuder Mech. ..............  .68     Kuder [illegible] ........  −.46
[illegible] ..............  .56
[illegible] ..............  .82

*This is a higher degree of relationship than exists between any scale of the
Strong and the Strong M.F. scale. The conclusion here is that since the
Mechanical scale of the Kuder consists of acceptance-rejection of mechanical
types of activities, a large portion of the variance of the M.F. scale of the
Strong can be explained by acceptance-rejection of mechanical types of
activities. This corresponds to Strong’s definition of masculinity-femininity
listed below and to the findings of Terman and Miles.

An analysis of the individual keys contributing positive loadings
shows that they are made from items on interest scales dealing with a
liking for science, mechanical activities, mathematics, and drawing.
Strong says that the masculinity scores are defined as an interest in
things or objects and feminine scores as interest in persons or
personalities (3). Terman and Miles point out that there is a very
pronounced relationship between masculinity and mechanical pursuits at
every educational level (7). Studies of mechanical tests have isolated
numerical and spatial factors (8). These findings plus the items on the
Kuder scales having positive loadings would lead one to surmise that
a positive score on this factor might mean a liking for activities of a
mechanical and spatial nature, or “things and objects” (3).

An analysis of the keys contributing negatively to Plane A shows 
that they are composed of items involving a common element of deal- 
ing with people, animate objects, or linguistic activities. This leads 
one to suspect that the factor might be that of verbalization, communi- 
cation, or “people.” This factor resembles the factor “Things versus 
People” that previous research has identified in the Strong Vocational 
Interest Blank. 

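Projections on the normal to Plane B are as follows:


         Positive                            Negative
MMPI   F .................  .55     MMPI   L .................  −.45
       D .................  .36
       Pd ................  .51
       Pt ................  .76
       Sc ................  .83
       Ma ................  .56
Bell   Home ..............  .60
       [illegible] .......  .87
       [illegible] .......  .33
       Emotional .........  .67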


The MMPI items which produce scores on these scales are those 
items which indicate the more serious areas of emotional upset or 
maladjustment. The greatest projections of the MMPI scales are 
those which Meehl (9) has called the “psychotic end of the curve” 
and which Gough refers to as the “psychotic phase” of the MMPI (12).

The items of the Bell contributing to scores on the keys with the 
heaviest saturation are also of this nature, indicating poor home ad- 
justment and emotional upset. There is not much overlapping of items 
on the scales with saturations of this factor. The greatest overlapping 
occurs in fourteen items of the 64-item F scale with the 78-item Sc 
scale. 

The negative loading on the Lie scale of the MMPI is rather dif- 
ficult to reconcile with this interpretation unless one operates on the 
hypothesis that people composing this sample are too sophisticated to 
be apt to answer the type of items comprising this scale in such a 
manner as to secure a high score. Another point to be considered here
is that the people securing deviate scores in this scale were eliminated
from the sample.

This factor evidently includes a variation of items concerned with 
expression of more serious maladjusted tendencies. The word tenden- 
cies emphasizes that the group on which the study is based is primari- 
ly a normal group. Some discrimination between the results found 
here and those to be anticipated in an abnormal group seems advis- 
able. 

The normal to Plane C has the following significant test projec- 
tions: 


         Positive                            Negative
Strong Group VIII ........  .67     Strong Group I ...........  −.37
Kuder Computational ......  .69     Kuder Soc. Serv. .........  −.40
Kuder Clerical ...........  [illegible]


The items contributing to the keys with positive loadings seem 
to indicate rather definitely that the positive pole of this factor is 
business detail, or a preference for activities of a routine, concrete 
nature in office work requiring quantitative judgement. Negative 
loadings are not pronounced, but might mean a dislike for business 
detail or preference for activities of a more abstract nature found in 
art, social service, or the ideational activities. A common element in 
these activities is that of qualitative judgement. This factor could be 
considered to be discriminating between interest in routine activities 
and those of a varied nature. 

All the loadings considered significant in Plane D are positive. 
They are as follows: 











         Positive
MMPI   Hs ................  .61
       D .................  .51
       Hy ................  .70
Bell   Health ............  [illegible]


The items contributing to these keys are heavily loaded with 
health and concern over minor personal matters. They correspond 
most closely to that part of the MMPI to which Meehl and Gough have 
referred as the “Neurotic Triad” (Hs, D, Hy) (9, 12). Brozek and 
Erickson have also identified these three scales (Hs, D, Hy) with 
“mild psychoneurosis” (10). 

Twenty of the items making up the 33-item Hs scale of the MMPI 
are identical with twenty of the 40-item Hy scale. They deal exclu- 
sively with health. So do the 35 items of the Bell Health key. This 
factor might be called hypochondriasis, or perhaps, in order to be 
consistent with Factor B, a better term might be neurotic tendencies. 
The only loadings on Plane E equal to or greater than the cri-
terion of ± .33 are:
         Positive
[illegible] ..............  .41
[illegible] ..............  [illegible]


These loadings do not seem strong enough to warrant an inter- 
pretation of this plane. The Kuder masculinity-femininity scale is a 
statistical score derived from weightings of the other nine scales. It 
has no items for analysis. There does not seem to be any possible 
course except to consider this as a residual plane. 

Loadings on the normal to Plane F are as follows: 


         Positive                            Negative
Strong Group VIII ........  .48     Strong Group I ...........  [illegible]
Strong Group IX ..........  [illegible]
                                    Strong Group II ..........  −.67
Kuder Persuasive .........  .74     Kuder Scientific .........  −.62


The items composing the Group IX scale on the Strong and the 
Kuder persuasive key are activities of a business contact nature. They 
involve liking for contact with people (interviewing and sales), pub- 
lic appearances, business surveys and advertising work, dislike for 
business activities of routine nature (non-contact), and dislike for 
mathematical, scientific, mechanical and cultural activities, writing, 
and reading (except business matters). The negative loadings appear 
on scales which are primarily a liking for activities scientific in nature, 
Groups I and II of the Strong and the Kuder Scientific key. This is 
supported by previous research summarized by Tyler which shows 
that business contact and scientific activities tend to be on opposite 
sides of a continuum (11). 

The normal to Plane G has loadings as follows: 


         Positive                            Negative
Strong O.L. ..............  [illegible]
                                    Kuder Mechanical .........  −.33
Strong Group IX ..........  .38
Strong Group X ...........  .42


The items contributing to this positive group are those showing 
concern for, or consciousness of, the reaction or esteem of others as 
this esteem contributes to a position of leadership. They might be 
classed as aspiration for those material things associated with leader- 
ship in our society by those who do not carry out tangible, productive 
work. In particular the Occupational Level key of the Strong has 
heaviest weights for dislike of mechanical and engineering activities 
or occupations, dislike for outdoor and common types of amusement, 
dislike for ordinary or odd people, and a liking for mathematics and 
science and for activities and work involving leadership. The scale
with a negative loading, the Kuder Mechanical key, gives some slight
evidence of this. This latter might be said to be mirroring unconcern
for a materialistic criterion of success.

The normal to Plane H exhibits the following significant test 
loadings: 





         Positive                            Negative
Strong Group V ...........  .60     Kuder Mech. ..............  −.36
Strong IM ................  .68     Kuder Art. ...............  −.52
                                    Bell Social ..............  −.40


The positive group of this factor seems to indicate rather definitely a 
preference for activities which concern working with people for their 
presumed good, as shown by items comprising Group V of the Strong. 
Counselors refer to this by the term “social welfare.” The items with 
high positive weights which comprise the Interest Maturity scale of 
the Strong emphasize a liking for linguistic and personal contact occu- 
pations, school subjects dealing with language and the social sciences, 
cultural and social types of amusement, activities dealing with people, 
a liking for above-average individuals, and self-ratings as a compe- 
tent, reliable, superior worker. Conversely, the items having high 
negative weights indicate a dislike for activities opposite in nature 
to the foregoing items. These items seem to be, then, a liking for acti-
vities of a social nature dealing with language and with people in
social situations. 

The negative group consists of those subtests dealing with a pre- 
ference for activities which can be carried on alone and which are 
generally classed as non-social in nature. A high score on the Bell 
Social key is classified as “retiring.” The negative pole of this factor 
is composed of items dealing with self-centered activities, or egocent- 
rism. 


Conclusions 


While this study does not produce any results at variance with 
previous studies and empirical evidence, the findings do present evi- 
dence concerning relationships among these four instruments which 
many people in the field have long suspected. It suggests also several 
promising areas for further investigation. The conclusions should be 
limited to a college adult male population in the years immediately 
following World War II. Some of the results are supported by previous
research; others appear here for the first time and need further
research for verification prior to use in counseling and education.

It is suggested that further investigation using similar and dissimilar
population samples would be useful in the following ways:

1. It would promote more efficient use of these instruments in
counseling and education.

2. It would aid in developing adequate testing devices by confirming
this evidence that personality and interest inventories are
complementary, not duplicative.

3. Maladjustment tendencies in normal populations might be
identified with educational and occupational patterns.

4. Occupational counseling using these interest inventories
could be improved.

5. Improved curricular offerings based on factors found in this
study would be possible through improved educational counseling.

6. New instruments could be developed to measure factors
found in this study.


Let us summarize what we know at present and what further in- 
formation is needed to supplement this study. 


1. Factor A resembles the factor “Things versus People” found
in previous research of Thurstone, Strong, and Carter, Pyles
and Bretnall using the Strong Vocational Interest Blank for
Men.

2. Factors B and D deal with personal adjustment of normal
individuals. They appear to correspond to pattern or profile
analyses by Meehl and by Gough using the MMPI. Two of the
divisions made by Meehl and by Gough are “neurotic scales”
(Hs, D, Hy) and “psychotic scales” (Pa, Pt, Sc).

3. Factor F appears to be a dichotomy of business contact
interests and scientific interests. This is found also in research
by Berdie, Darley, Hahn, and others. A summary of previous
findings which apply to this factor has been published by
Tyler (11).

4. Factor C is apparently a new factor unconnected with previous
research and needs verification. Its positive pole would
appear to be concerned with activities and interests of a
routine, business detail nature.

5. Factor G is apparently a new factor concerned with interests
saturated most highly in Groups IX, X, and the Occupational
Level key of the Strong Vocational Interest Blank for Men.
It appears to deal with interest in leadership activities. This
needs verification by further research.

6. Factor H also appears to be a new factor found in Group V 
and the Interest Maturity scale of the Strong. It appears to 
be comprised of interest in activities of a social nature deal- 
ing with people and language. 

7. All seven factors appear in dichotomous or bipolar form, 
possibly foreshadowing the nature of future research in those 
areas of counseling and education where self-inventories of 
personality and interest are used. 


Thus it is evident that this exploratory study has discovered 
seven common factors in the four instruments used. The solution 
reached by these rotations indicates that, with the exception of the 
projection of the Bell Social key on the normal to Plane H, there is no 
overlap between the personality inventories and the interest inventor- 
ies. Two of the factors are common to the two personality inventories 
and five of the factors are common to the two interest inventories. The 
two factors common to the personality inventories indicate a great 
deal of overlap between the sub-scores on these two instruments and 
suggest that educational or occupational pattern-analysis might be a 
promising approach. The overlap of subtests of the interest inven- 
tories serves to emphasize that the parts of these two tests which one 
would expect to find related do have saturations of a common factor. 

A subsequent step would be to attempt to develop instruments to 
measure pure factors. Research could then be attempted to determine 
the implications of these pure factors for various types of education- 
al training and for occupations. Another approach would be to try 
other instruments in varying combinations with the instruments used 
here in order to determine whether any of the specific factors are 
common to other measuring devices.


TABLE 1

Brief Description of Variables Used in the Study

Variable   Test                     Author’s Description

MMPI
 1 L       Lie Factor               Falsify scores by choosing most socially
                                    accepted response
 2 F       Validity                 Check on validity of test-rationale,
                                    pertinent responses
 3 Hs      Hypochondriasis          Amount of abnormal concern about bodily
                                    functions
 4 D       Depression               Depth of clinically recognized symptom
                                    complex, depression
 5 Hy      Hysteria                 Conversion-type hysteria symptoms
 6 Pd      Psychopathic Deviate     Absence of deep emotional response,
                                    inability to profit from experience,
                                    disregard of social mores
 7 Mf      Masculinity-Femininity   Tendency toward masculine or feminine
                                    interests
 8 Pa      Paranoia                 Suspiciousness, over-sensitivity,
                                    delusions of persecution
 9 Pt      Psychasthenia            Phobias or compulsive behavior
10 Sc      Schizophrenia            Bizarre and unusual thoughts or behavior
11 Ma      Hypomania                Marked over-productivity in thought and
                                    action

Strong                              Representative of interests of:
12 I       Scientific               Artist, psychologist, architect,
                                    physician, dentist
13 II      Technical                Mathematician, engineer, chemist,
                                    physicist
14 V       Social Welfare           YMCA physical director and secretary,
                                    personnel manager, social science
                                    teacher, city school superintendent,
                                    minister
15 VIII    Business Detail          Accountant, office worker, purchasing
                                    agent, banker
16 IX      Business Contact         Sales manager, realtor, life insurance
                                    salesman
17 X       Linguistic               Advertising man, lawyer, author-journalist
18 IM      Interest Maturity        Difference in interest with age as
                                    associated with various occupational
                                    interests
19 OL      Occupational Level       Interest consistent with varying levels
                                    of aspiration or occupations
20 MF      Masculinity-Femininity   Interest in things and objects vs. people
                                    and ideas

Kuder                               Degree of interest in:
21 M       Mechanical               Mechanical activities
22 C       Computational            Computational activities
23 S       Scientific               Scientific activities
24 P       Persuasive               Persuasive activities (business contact)
25 A       Artistic                 Artistic activities
26 L       Literary                 Literary activities
27 M       Musical                  Musical activities
28 SS      Social Service           Social service activities
29 Cl      Clerical                 Clerical activities
30 Mf      Masculinity-Femininity   Greatest differentiation of the two sexes

Bell                                High scores indicate:
31 Ho      Home                     Unsatisfactory home adjustment
32 He      Health                   Unsatisfactory health adjustment
33 So      Social                   Persons submissive and retiring in social
                                    contact
34 Em      Emotional                Individuals who tend to be unstable
                                    emotionally


TABLE 3


Factor Matrix F*

                        Factor
Test       I   II  III   IV    V   VI  VII  VIII
MMPI 
L 1 —20 —17 —23 11 30 10 —11 —07
F 2 44 45 —04 —10 —09 04 —03 16 
Hs 3 50 48 11 15 24 —26 —32 —16 
D 4 45 37 —16 28 21 18 —21 —08 
Hy 5 21 82 —08 08 47 —19 —36 —04 
Pd 6 41 47 —07 —17 06 05 02 15 
Mf 7 16 46 —30 05 10 13 15 20 
Pa 8 30 32 —19 —02 12 07 12 08 
Pt 9 68 58 04 04 —16 11 07 —06 
Sc 10 64 58 04 —16 —16 06 12 09
Ma 11 22 82 21 —34 —18 —20 09 09
STRONG
I 12 48 —24 —68 07 —35 —17 —05 07
II 13 61 —53 —23 11 —05 —13 18 22 
V 14 —48 29 —20 —41 38 —16 28 —20 
VIII 15 —17 08 73 22 18 35 14 —08 
IX 16 —61 51 37 —11 —27 10 —33 05 
X 17 —26 40 —46 25 —40 —08 —16 26 
IM 18 —236 15 08 —17 386 —25 43 08 
OL 19 —24 ad 06 20 —35 —40 —27 40 
MF 20 34 58 —51 —13 12 —14 14 —07 
KUDER 
M 21 58 —63 28 —24 09 19 —24 —1l11 
C 22 14 —16 386 47 22 a7 30 11 
S 23 46 —59 —04 10 11 —24 15 21
P 24 —49 26 41 —20 10 21 —36 19
A 25 34 —14 —35 —09 —29 20 —22 —24
L 26 —28 27 —34 08 04 06 09 25
M 27 —04 34 —37 —06 —15 7 16 —10
SS 28 —26 12 —29 —21 09 —28 —15 —12
Cl 29 —18 26 41 56 05 26 24 —05
Mf 30 20 —40 56 —19 31 08 —19 AT 
BELL 
Ho 31 41 43 08 —11 —12 —04 12 07 
He 32 39 42 15 13 11 —23 —12 —20 
So 33 50 12 —08 26 —21 28 12 —20 
Em 34 61 50 10 12 —17 a4 16 —18 
Σ 551 517 43 59 12 13 —06 118





*Decimal points are omitted. The sums given in the last row of each
column are those obtained before the column entries were rounded to
two places.


TABLE 4*











Rotated Factor Matrix FΛ = V

Test         A    B    C    D    E    F    G    H
MMPI
L     1    -13  -45  -02   26   14  -01  -20   01
F     2     06   55  -12   09   16   03   08   02
Hs    3     16   36  -01   66  -05  -03   12   04
D     4    -11   18   08   51   22   03  -14  -17
Hy    5     11   00  -12   70   18   01   02   18
Pd    6     06   51  -15   15   22   04  -01   15
Mf    7    -23   28  -03   15   33  -03  -04   18
Pa    8    -05   31  -06   15   20  -09  -10   13
Pt    9    -04   76   06   10  -01   00  -04  -13
Sc   10     08   88  -08   00   07   01   03   03
Ma   11     21   56  -17  -18  -13   04   19   20
STRONG
I    12     00   12  -38   00  -01  -62   00  -29
II   13     44   13  -01  -08   12  -67  -03  -09
V    14    -2   -18  -30   06  -08   06  -26   60
VIII 15     02  -02   67  -10   01   48  -05  -01
IX   16    -30  -01   00  -11  -07   77   38  -05
X    17    -50   01  -14   02   13   00   42  -11
IM   18    -01  -01   06  -06   02  -09   02   68
OL   19     05   02   00  -03   00  -04   76   03
MF   20     69   07   18  -19  -20  -22  -08   02
KUDER
M    21     68  -01  -10   00   00  -03  -33  -36
C    22     12  -01   69  -01   15  -10   00   03
S    23     56   02   07  -02   08  -62   03   05
P    24     01  -15   00   05   23   74   21   09
A    25    -10   03  -32   00  -11  -06  -29  -52
L    26    -35  -08  -05   05   29   02   09   21
M    27    -46   14  -16  -07   00   03  -19  -03
SS   28    -14  -17  -40   19  -11  -04   02   20
Cl   29    -31   00   77  -03   00   25   08  -06
Mf   30     82  -01   12   01   41   16   16   15
BELL
Ho   31     07   60  -03  -02   00  -02   08   09
He   32     07   37   07   37  -17  -05   09   04
So   33    -16   33   24  -0   -08  -08  -22  -40
Em   34    -09   67   18   04  -14  -02  -09  -16

Σ           95  582   52  257  180  -17   70   76
Ch          95  582   52  257  180  -17   70   76

*Decimal points are omitted. The entries in the two bottom rows were first obtained to four
places and later rounded to two places.


TABLE 5
Rotated Factor Matrix FΛ = V Showing Loadings of ±.33 or More

Test        A     B     C     D     E     F     G     H
MMPI
L     1        -.45
F     2         .55
Hs    3         .36         .66
D     4                     .51
Hy    5                     .70
Pd    6         .51
Mf    7                           .33
Pa    8
Pt    9         .76
Sc   10         .88
Ma   11         .56
STRONG
I    12              -.38              -.62
II   13   .44                          -.67
V    14                                             .60
VIII 15               .67               .48
IX   16                                 .77   .38
X    17  -.50                                 .42
IM   18                                             .68
OL   19                                       .76
MF   20   .69
KUDER
M    21   .68                                -.33  -.36
C    22               .69
S    23   .56                          -.62
P    24                                 .74
A    25                                            -.52
L    26  -.35
M    27  -.46
SS   28              -.40
Cl   29               .77
Mf   30   .82                     .41
BELL
Ho   31         .60
He   32         .37         .37
So   33         .33                                -.40
Em   34         .67


TABLE 6
Transformation Matrix Λ

        A       B       C       D       E       F       G       H
I .4501 5962 —.0849 1676 .0537 —.3551 —.1474 —.1861 
II —4096  .5142 —.0075  .2347 0614 .3338 .1699 2845 
III 4447 23385  .5063 —.2122 —.2026 4078  .2629 .0289 
IV —.3236 —.2564 7543 2854 03822 —.2321 .2654 —.2768 
v 2289 —.3278 .0668 .6678 4190 —.0285 —.3482 5594 
VI —.3200 —.1278 1917 —.1659 4808 6083 —.5564 —.4564 
VII —.1697 .3285  .38626 —.5506 —.1267 —.4191 —.2176 4441 
VIII 3792 1681 0281 —.1172 -7578 —.0465 -5760 3450 

TABLE 7
Correlation of Reference Vectors C = Λ′Λ

        A        B        C        D        E        F        G        H
A    1.0002    .2184   -.1285    .0475    .1654   -.1862    .2488    .2512
B             1.0001   -.0224   -.3004   -.0986   -.0986    .2034    .1675
C                      1.0001   -.0899    .0053    .0018    .1445   -.0691
D                               1.0000    .2649   -.0169   -.0528    .1004
E                                        1.0002    .1772    .0863    .2309
F                                                 1.0003   -.1067   -.2729
G                                                          1.0001    .1640
H                                                                   1.0001

TABLE 8
Inverse of Table 6, Λ⁻¹
39345 —.64592 64836 —.30663 .25750 —.11316 —.23786 09304 
-76092 -74226 .06889 —.09795 —.20805 07771 19246 01151 
07928 —.03955 59154 -67700 24757 20187 38258 —.06070 
36598 48128 —.05243 384510 51207 —.25445 —.53447 —.32149 
-17788 —.03632 —-.45393 11528 -08551 56551 03214 820381 
—.41748 40693 66844 —.37846 .09917 33273 —.36333 —.11339 
—.35451 -20510 .05787 29472 —.46543 —.51307 —.40483 49274 
—.54045 30714 17835 —.34943 57486 —.35710 -51147 04662 

TABLE 9
Diagonal Matrix $D = 1/\sqrt{b_{pp}}$

.9013881
.9013881
.9675858
.9116601
.8932559
.9146620
.9347541
.8901548


TABLE 10
Primary Factors T = DΛ⁻¹

.3547 —.5822 5844 —.2764 —.2321 —.1020 —.2144 0839 
-6859 6691 .0621 —.08838 —.1875 .0700 1735 .0104 
.0767 —.0383 5724 6551 .2395 1953 38702 —.0587 
38336 4388 —.0478 8146 .4668 —.2320 —.4873 —.2931 
.1589 —0324 —.4055 1080 .0764 5051 .0287 -7327 
—.3819 8722 6114 —.3462 .0907 3043 —.3323 —.1037 

—.8314 1917 .0541 .2755 —.4851 —.4796 —.3784 .4606 
—.4811 2734 1588 —.3110 .5117 —.3179 .4553 -0415 

TABLE 11
Correlation of Primary Factors TT′

       T_A      T_B      T_C      T_D      T_E      T_F      T_G      T_H
T_A  1.0000   -.1725    .1548   -.0401   -.1687    .1584   -.2060   -.0940
T_B           1.0001    .0371    .3003    .0867    .0012   -.1829   -.1485
T_C                    1.0001    .0908   -.0666    .0488   -.1863    .0664
T_D                             1.0002   -.2197    .0618    .0152   -.0674
T_E                                      1.0000   -.2812   -.0013   -.2597
T_F                                               1.0000    .0282    .2843
T_G                                                        1.0001   -.0886
T_H                                                                 1.0001


A DEVICE FOR FACILITATING THE COMPUTATION OF
THE FIRST FOUR MOMENTS ABOUT THE MEAN* 


JOHN LYMAN 
UNIVERSITY OF CALIFORNIA AT LOS ANGELES 
AND 
PIETRO V. MARCHETTI 
UNIVERSITY OF ILLINOIS 


A mechanical device is described which materially reduces the
time, labor, and probability of errors which are involved in the
computation of certain moments about the mean. The use of the device
is illustrated with the computation of the mean, standard deviation,
$g_1$ as a measure of skewness, and $g_2$ as a measure of kurtosis of
data from a given study.


The labor involved in the computation of the first four moments 
about the mean in statistical analyses of data often discourages the 
use of moments higher than those necessary for obtaining the mean 
and standard deviation of any distribution. The limitation of these 
two lower moments is that they do not adequately describe the shape 
of the distribution, so that where circumstances require it an appro- 
priate mathematical transformation can be applied to the obtained 
distribution to meet the assumption of “normal” distribution neces- 
sary for the application of many statistical formulas. Accordingly, 
in order to materially reduce the time and labor required as well as 
the probability of errors in the work, the authors have constructed 
the device described herein for aiding in the computation of the first 
four moments about the mean. After a description of the device, 
some data from an unpublished study by Dr. Joseph A. Gengerelli of 
the University of California at Los Angeles will be treated statis- 
tically in order to illustrate possible advantages in the use of this 
device for computing certain statistics. 


*Grateful acknowledgment is made to the Holter Research Foundation of
Helena, Montana for the grant-in-aid which made possible the development of
this device.


Description of the Device


FIGURE 1


The apparatus illustrated in Figure 1 consists of a plywood box,
A, in which are placed 26 endless belt tapes, B, which may be
adjusted by turning the wooden dowels, C. The dowels have a diameter
of 1/2 inch and serve as rollers over which the endless belt tapes pass.
The tension on the tapes is maintained by adjusting the movable 
strips, E , which support the bottom row of dowels. To prevent slip- 
page of the tapes over the dowels, the upper dowels are covered with 
rubber laboratory tubing, F. The surface of the rubber tubing has 
been abraded with sand paper to improve the traction between it and 
the paper tape. The cover of the device consists of a piece of trans- 
parent vinylite plastic 1/16 inch in thickness. The windows, for re- 
stricting the view of the interior of the mechanism to the numbers 
turned up on the tapes, were made by placing a strip of masking tape 
on the vinylite cover over each endless belt tape. The uncovered por- 
tion of the plastic cover was sprayed with paint. After the paint had 
dried, the masking tape was removed. This left “windows” in the 
plastic cover through which one reads the numerical values on the 
endless belt tapes. 

The numerical material was typed on the tapes in elite type.


TABLE 1
O’s Crossed Out by 7th-Grade Students in Five Minutes

Score     x      f     fx    fx^2    fx^3     fx^4  f(x+1)^4
  393    16      1     16     256    4096    65536     83521
  380    15      3     45     675   10125   151875    196608
  367    14      1     14     196    2744    38416     50625
  354    13      7     91    1183   15379   199927    268912
  341    12      3     36     432    5184    62208     85683
  328    11      2     22     242    2662    29282     41472
  315    10      5     50     500    5000    50000     73205
  302     9     10     90     810    7290    65610    100000
  289     8     13    104     832    6656    53248     85293
  276     7     26    182    1274    8918    62426    106496
  263     6     26    156     936    5616    33696     62426
  250     5     44    220    1100    5500    27500     57024
  237     4     80    320    1280    5120    20480     50000
  224     3     95    285     855    2565     7695     24320
  211     2    183    366     732    1464     2928     14823
  198     1    193    193     193     193      193      3088
  185     0    253      0       0       0        0       253
  172    -1    162   -162     162    -162      162         0
  159    -2    162   -324     648   -1296     2592       162
  146    -3    119   -357    1071   -3213     9639      1904
  133    -4     57   -228     912   -3648    14592      4617
  120    -5     38   -190     950   -4750    23750      9728
  107    -6     13    -78     468   -2808    16848      8125
   94    -7      4    -28     196   -1372     9604      5184
   81    -8      1     -8      64    -512     4096      2401
   68    -9      1     -9      81    -729     6561      4096
Totals        1502    806   16048   70022   958864   1339966

$$\nu_1 = \frac{\sum fx}{N} = \frac{806}{1502} = .54 \qquad \nu_2 = \frac{\sum fx^2}{N} = \frac{16048}{1502} = 10.68$$

$$\nu_3 = \frac{\sum fx^3}{N} = \frac{70022}{1502} = 46.62 \qquad \nu_4 = \frac{\sum fx^4}{N} = \frac{958864}{1502} = 638.39$$

$$M = GM + i\nu_1 = 191 + 13(.54) = 198.02$$

$$\sigma = \sqrt{i^2(\nu_2 - \nu_1^2)} = \sqrt{169(10.68 - .29)} = 41.78$$

$$g_1 = \sqrt{\beta_1} = \frac{i^3(\nu_3 - 3\nu_2\nu_1 + 2\nu_1^3)}{\sigma^3} = \frac{2197[46.62 - 3(10.68 \cdot .54) + 2(.54)^3]}{(41.78)^3} = .87$$

$$g_2 = (\beta_2 - 3) = \frac{i^4(\nu_4 - 4\nu_3\nu_1 + 6\nu_2\nu_1^2 - 3\nu_1^4)}{\sigma^4} - 3 = \frac{28561[638.39 - 4(46.62 \cdot .54) + 6(10.68[.54]^2) - 3(.54)^4]}{3047011.64} - 3 = 5.21 - 3 = 2.21$$


A good grade of paper was used in preparing the tapes, each of which
is 8 1/2 inches by 36 inches in size. The numerical values typed on
each tape consist of the frequency, f (from 0 to 200, inclusively, for
the class interval, x, of the given tape), $fx$, $fx^2$, $fx^3$, $fx^4$,
and $f(x+1)^4$. The last-mentioned value is given as an aid in the
computation of Charlier’s check (1, 64). Tapes for 26 class intervals,
0 to 25, inclusively, were prepared. After the tapes were typed, the
computations involved in the preparation of the tapes were checked by
being recomputed. This device makes possible the analysis
of data which have been divided into as many as 51 intervals, with
the frequency at each interval not greater than 200. The one exception
to this limit of the frequency of the intervals is that of the
frequency of the interval in which the assumed mean lies. In this case
the frequency may be greater than 200, since all other values on the
tape for this interval become zero with the exception of the value
$f(x+1)^4$, which is always equal to f as x equals zero.
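
The entries on one tape position, and the check that the last column supports, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tape_entry` and `charlier_check` are hypothetical, and x is taken as a signed deviation, whereas on the device the sign is affixed by the operator. The check uses the identity that the sum of $f(x+1)^4$ equals the sum of $fx^4$ plus 4 times the sum of $fx^3$ plus 6 times the sum of $fx^2$ plus 4 times the sum of $fx$ plus N.

```python
def tape_entry(f, x):
    """Values shown for frequency f at coded interval x:
    f, fx, fx^2, fx^3, fx^4 and f(x+1)^4."""
    return (f, f * x, f * x**2, f * x**3, f * x**4, f * (x + 1) ** 4)


def charlier_check(entries):
    """True when the column sums satisfy the expansion of (x+1)^4."""
    n, s1, s2, s3, s4, s_check = (sum(col) for col in zip(*entries))
    return s_check == s4 + 4 * s3 + 6 * s2 + 4 * s1 + n


# Three rows taken from Table 1 (x = 13, x = 0, x = -7).
rows = [tape_entry(7, 13), tape_entry(253, 0), tape_entry(4, -7)]
print(rows[0])
print(charlier_check(rows))
```

Because the identity holds row by row, any subset of correctly typed rows passes the check; a single mistyped entry in any column makes it fail.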


Procedure for the Use of the Device 


In order to illustrate the procedure for the use of this device, the 
mean, standard deviation, a measure of skewness, and a measure of 
kurtosis are computed in Table 1 for the data obtained by Gengerelli 
from 1502 seventh-grade school children. 

The steps in performing these computations are the following: 

1. The interval in which the guessed or assumed mean lies is
selected. The tape at this interval, x equals 0, is turned until one sees
through the window of the f-column of this tape the numerical value
of the frequency of the interval of the guessed mean. This same step
is repeated with each interval above the interval, x equals 0. Selection
of the appropriate value of f on each tape automatically brings
into view the values $fx$, $fx^2$, $fx^3$, $fx^4$, and $f(x+1)^4$* for the
given values of f and x. There is visible, thus, through the windows in the
plastic cover of this device the data which appear within the rectangle
drawn on Table 1. It will be noted that the interval in the distribution
which had a frequency greater than 200 was chosen as the
interval in which the mean is assumed to lie. If for any reason this
could not be done or had there been more than one interval with a
frequency greater than 200, it would have been necessary to distribute
the raw data throughout a larger number of intervals in order that
no interval would have a frequency greater than 200, or at least no
more than one interval.

*We shall not discuss the application of the column $f(x+1)^4$ here, as it is
particularly useful only when the summations of the other columns have been
made on calculating machines without recording tapes; that is, in those cases
where a cumulative record of the computations of these various functions of
f and x is not available for checking, manipulation of $\sum f(x+1)^4$ can serve as
a check on the computations of the aforementioned functions of f and x (1, p.
64). In this procedure, we are assuming that an electric or hand-operated adding
machine with a recording tape is used.


2. Compute the following sums: $\sum f$, $\sum fx$, $\sum fx^2$, $\sum fx^3$, and
$\sum fx^4$. Affix a plus sign to the totals.


3. Check the adding machine tapes against the figures as they
appear on the computing aid.


4. Repeat Step 1, this time turning the frequency at each class
interval below x = 0 into the apparatus.


5. Repeat Step 2, affixing a minus sign to $\sum fx$ and $\sum fx^3$, and
a plus sign to $\sum f$, $\sum fx^2$, and $\sum fx^4$.


6. Check the adding machine tapes as in Step 3.


7. Combine algebraically the sums obtained in Steps 2 and 5.


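8. We are now ready to compute the first four moments ($\nu_1$,
$\nu_2$, $\nu_3$, $\nu_4$) about the assumed mean. Compute these according to the
following formulas (1, 60):

$$\nu_1 = \frac{\sum fx}{N}, \qquad \nu_2 = \frac{\sum fx^2}{N}, \qquad \nu_3 = \frac{\sum fx^3}{N}, \qquad \nu_4 = \frac{\sum fx^4}{N}, \qquad \text{where } N = \sum f.$$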

9. Apply the values obtained in Step 8 to the following formulas
(2, 27) to obtain the mean, M, the standard deviation, $\sigma$, a
measure of skewness, $\sqrt{\beta_1}$, and a measure of kurtosis, $\beta_2 - 3$.

$$M = GM + i\nu_1,$$

where GM is the guessed mean and i is the size of the class interval;

$$\sigma = \sqrt{i^2[\nu_2 - (\nu_1)^2]};$$

$$\sqrt{\beta_1} = \frac{i^3[\nu_3 - 3\nu_2\nu_1 + 2(\nu_1)^3]}{\sigma^3};$$

$$\beta_2 - 3 = \frac{i^4[\nu_4 - 4\nu_3\nu_1 + 6\nu_2(\nu_1)^2 - 3(\nu_1)^4]}{\sigma^4} - 3.$$

Both $\sqrt{\beta_1}$ and $\beta_2 - 3$ will equal zero when the distribution is normal.
The distribution is skewed positively or negatively to the extent that
$\sqrt{\beta_1}$ deviates from zero, is flat-topped (platykurtic) to the extent that
$\beta_2 - 3$ is less than zero, and is steep or peaked (leptokurtic) to the
extent that $\beta_2 - 3$ is greater than zero.
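
Steps 8 and 9 can be retraced in code as a check on the arithmetic. The sketch below is illustrative only: the function names are hypothetical, the frequencies are those of Table 1 keyed by the coded interval x, and the computation carries full floating-point precision, so its results may differ slightly from the hand-computed, two-decimal figures shown with Table 1.

```python
import math

# Frequency f for each coded interval x (x = 0 holds the guessed mean), from Table 1.
FREQUENCIES = {
    16: 1, 15: 3, 14: 1, 13: 7, 12: 3, 11: 2, 10: 5, 9: 10, 8: 13,
    7: 26, 6: 26, 5: 44, 4: 80, 3: 95, 2: 183, 1: 193, 0: 253,
    -1: 162, -2: 162, -3: 119, -4: 57, -5: 38, -6: 13, -7: 4, -8: 1, -9: 1,
}


def moments_about_assumed_mean(freqs):
    """Step 8: N and the first four moments about the assumed mean."""
    n = sum(freqs.values())
    return n, [sum(f * x**k for x, f in freqs.items()) / n for k in (1, 2, 3, 4)]


def summary_statistics(freqs, guessed_mean, interval):
    """Step 9: mean, standard deviation, skewness and kurtosis measures."""
    n, (v1, v2, v3, v4) = moments_about_assumed_mean(freqs)
    mean = guessed_mean + interval * v1
    sigma = math.sqrt(interval**2 * (v2 - v1**2))
    skewness = interval**3 * (v3 - 3 * v2 * v1 + 2 * v1**3) / sigma**3
    kurtosis = interval**4 * (v4 - 4 * v3 * v1 + 6 * v2 * v1**2 - 3 * v1**4) / sigma**4 - 3
    return n, mean, sigma, skewness, kurtosis


n, mean, sigma, skewness, kurtosis = summary_statistics(FREQUENCIES, guessed_mean=191, interval=13)
print(n, round(mean, 2), round(sigma, 2), round(skewness, 2), round(kurtosis, 2))
```

Both shape measures come out positive for these data, in agreement with the direction of the values reported in Table 1.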


Evaluation of Apparatus 

Some of the advantages of the technique presented herein ap- 
pear to be as follows: 

1. It was found that the problem illustrated above could be 
computed in approximately 35 minutes from the raw frequency dis- 
tribution by an experienced operator as compared with approximate- 
ly 90 minutes when conventional methods (i.e., an electric calcula- 
tor) were used, a saving of 55 minutes or about 60%. 


2. No transcription of numbers other than final sums is re- 
quired, reducing time consumed as well as the probability of error. 


3. There are no hidden errors which must be hunted for, frequently
requiring a complete recomputation to discover, as when an
electric calculating machine is used exclusively. All the data are
before the operator.


4. A calculating machine and auxiliary tables are not required.
An adding machine used in conjunction with a slide rule for the final
computations is sufficient.


5. An inexperienced operator can do accurate work efficiently.


In use it has been found that the large size of the apparatus in
its present form does not present a handicap. However, the writers
believe that a more compact piece of equipment would be desirable
and accordingly propose to construct a model of the type illustrated
in Figure 2.


FIGURE 2


A NOTE ON THE CALCULATION OF WEIGHTS FOR
MAXIMUM BATTERY RELIABILITY 


BERT F. GREEN, JR. 
EDUCATIONAL TESTING SERVICE 


Weights may be determined for combining tests so that the composite
has maximum reliability, but the calculation of these weights
by means of the original equations is cumbersome. It is shown that
the desired weighted composite is the first principal axis of a matrix
closely related to the intercorrelation matrix. Thus, simple and
straightforward procedures are available for calculating the weights.


A possible weighted composite of scores on a test battery, when 
no external criterion is available, is the composite which has maxi- 
mum reliability. E. A. Peel (5) has shown that the weights for such 
a composite are given by 


$$(C - \lambda R)W = 0, \tag{1}$$


where W is a column vector of weights, C is the matrix of test
intercorrelations with reliabilities in the diagonal, R is the same matrix
but with unity in the diagonal, and $\lambda$ is the battery reliability given
by the weights W. The value of $\lambda$ is the largest root of the
determinantal equation


$$|C - \lambda R| = 0. \tag{2}$$


The solution of this equation can be facilitated by a simple trans- 
formation which allows the use of Hotelling’s (3) iterative method 
for finding the largest principal axis of a variance-covariance matrix. 


Let $r_{ij}$ = correlation of tests i and j, and $r_{ii}$ = reliability of test i.
The off-diagonal elements of (2) are

$$r_{ij}(1 - \lambda), \tag{3}$$

while the diagonal elements are

$$r_{ii} - \lambda = \left[\frac{r_{ii}}{1 - r_{ii}} - \frac{\lambda}{1 - \lambda}\right](1 - r_{ii})(1 - \lambda). \tag{4}$$

Thus (2) becomes

$$|C - \lambda R| = (1 - \lambda)^n \prod_{i=1}^{n} (1 - r_{ii}) \, |DCD - \beta I| = 0, \tag{5}$$

where n is the number of tests, or

$$|DCD - \beta I| = 0 \quad (\text{if } r_{ii} \neq 1 \text{ for all } i), \tag{6}$$

where D is a diagonal matrix with elements $1/\sqrt{1 - r_{ii}}$, I is the
identity matrix, and $\beta = \lambda/(1 - \lambda)$, a scalar. (If some $r_{ii} = 1$, non-zero
weights for at least one of these and zero weights for all tests with
non-perfect reliability will yield perfect battery reliability.) ($\lambda < 1$
unless some $r_{ii} = 1$.)

Since $\beta = \lambda/(1 - \lambda)$ and $\lambda = \beta/(1 + \beta)$, the largest value of $\beta$ will be
equivalent to the largest value of $\lambda$, so the first principal axis of DCD is
desired, say V, a column vector.

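Finally,

$(DCD - \beta I)V = 0$ is equivalent to the n equations

$$\sum_{j=1}^{n} \frac{v_j r_{ij}}{\sqrt{1 - r_{jj}}} - v_i \sqrt{1 - r_{ii}} \, \beta = 0, \qquad i = 1, \dots, n.$$

$(C - \lambda R)W = 0$ is equivalent to the n equations

$$\sum_{j=1}^{n} w_j r_{ij} - w_i (1 - r_{ii}) \beta = 0, \qquad i = 1, \dots, n.$$

Therefore

$$w_i = k \frac{v_i}{\sqrt{1 - r_{ii}}}, \quad \text{or} \quad W = kDV \tag{7}$$

(k is a constant of proportionality).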





Lawley’s (4) factoring procedure may also be used. If $1 - r_{ii}$
is treated as specific variance and loadings on the first factor are
computed by Lawley’s iterative technique, the row vector of these
loadings, L, is related to V and W by


$$V = DL, \qquad W = kD^2L. \tag{8}$$


Aitken (1) has discussed the general problem of solving

$$(A - \lambda B)W = 0, \tag{9}$$

where A and B are square matrices.
He shows that this is equivalent to

$$(B^{-1}A - \lambda I)W = 0, \tag{10}$$

and thus W is the first latent column vector of $B^{-1}A$. In the case
discussed in this note, we may write

$$(C - \lambda R)W = 0, \qquad R = C + D^{-2},$$

$$[C(1 - \lambda) - \lambda D^{-2}]W = 0,$$

$$(D^2C - \beta I)W = 0. \tag{11}$$

Aitken’s procedure may then be used on $D^2C$.
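
A small numerical sketch may make equation (11) concrete. It is not Aitken's own procedure: it simply applies a plain power iteration to $D^2C$, whose largest latent root is $\beta$ and whose latent vector is W directly. The two-test matrix (reliabilities .8 and .6, intercorrelation .5) and the function name `dominant_root` are hypothetical values chosen for illustration.

```python
def dominant_root(m, iterations=200):
    """Power iteration scaled by the largest element; returns (root, vector)."""
    n = len(m)
    w = [1.0] * n
    root = 0.0
    for _ in range(iterations):
        product = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
        root = max(product)          # converges to the largest latent root
        w = [x / root for x in product]
    return root, w


# Hypothetical two-test battery: reliabilities in the diagonal of C.
c = [[0.8, 0.5],
     [0.5, 0.6]]
r = [[1.0, 0.5],
     [0.5, 1.0]]

# D^2 C: row i of C divided by (1 - r_ii).
d2c = [[c[i][j] / (1.0 - c[i][i]) for j in range(2)] for i in range(2)]

beta, w = dominant_root(d2c)
lam = beta / (1.0 + beta)            # battery reliability

# Cross-check: reliability of the weighted composite, W'CW / W'RW.
wcw = sum(w[i] * c[i][j] * w[j] for i in range(2) for j in range(2))
wrw = sum(w[i] * r[i][j] * w[j] for i in range(2) for j in range(2))
print(round(lam, 4), [round(x, 3) for x in w], round(wcw / wrw, 4))
```

The ratio W′CW / W′RW computed at the end should agree with $\lambda = \beta/(1+\beta)$, which is what equation (1) implies for the maximizing weights.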

The methods of equations (6), (8), and (11) seem simpler and 
swifter than the successive approximation method suggested by Peel, 
in which the determinant must be calculated for various trial values
of $\lambda$. When the number of tests becomes large, the iterative proce-
dures are especially valuable. Even with four tests, the author be- 
lieves the iterative procedures to be superior. 

The interpretation of these weights is interesting. Let

$$y_i = \frac{x_i}{\sigma_i \sqrt{1 - r_{ii}}} = \frac{z_i}{\sqrt{1 - r_{ii}}},$$

where $\sigma_i$ is the standard deviation of the scores $x_i$. Then DCD is the
variance-covariance matrix of the variables $y_i$, which we may call
“relative accuracy” scores, since they are weighted inversely as their
standard errors of measurement. These variables $y_i$ have been
proposed on a priori grounds by Gulliksen (2). If we define a set of
weights U by $u_i = w_i/\sigma_i$, it is clear that

U are the weights for deviation (or raw) scores $x_i$,
W are the weights for standard scores $z_i$,
V are the weights for relative accuracy scores $y_i$.


Furthermore it is evident that these are all equivalent to weighting
so that the composite is the first principal axis of DCD.

To illustrate the present method, an example given by Peel has 
been reworked using the Hotelling method, equation (6). A similar 
procedure can be used for the Aitken method, equation (11). The 


matrix C is

    .770   .518   .390   .288
    .518   .723   .372   .109
    .390   .372   .875   .402
    .288   .109   .402   .837

The entries in D are

    2.085  1.900  2.828  2.477


The iteration is performed on the matrix DCD, shown with its row sums
at the right:

    3.347  2.052  2.300  1.487  |   9.186
    2.052  2.610  1.999  0.513  |   7.174
    2.300  1.999  6.998  2.816  |  14.113
    1.487  0.513  2.816  5.136  |   9.952


The column vectors have been written as rows for convenience.
After each iteration, the vector has been normalized. The initial
weights are proportional to the column of DCD having the largest sum.
(S is the normalizing factor, the square root of the sum of squares
for the vector.)


                                                      S
V_1           .283     .246     .861     .346
DCD V_1      3.947    3.121    8.142    4.749    10.685
V_2           .369     .292     .762     .444
 . . .
V_6          .3943    .3004    .7246    .4787
DCD V_6     4.3145   3.2872   7.9262   5.2395   10.9406
V_7          .3943    .3005    .7245    .4789

D V_7        .8221    .5710   2.0489   1.1862

kD V_7 = W     .40      .28     1.00      .58    (k = 1/2.0489)


The normalizing factor converges to $\beta$; thus by $\lambda = \beta/(1 + \beta)$, $\lambda = .9163$.

From $\rho = W'CW / W'RW$, which is the battery reliability with weights
W, $\rho = .9163$, which agrees with the above value of $\lambda$ since $\lambda$ and $\rho$
are theoretically equal. Peel computes the weights

    .40   .28   1.00   .59

The difference between .58 and .59 is negligible since, using Peel’s
weights, $\rho = .9164$.
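
The worked example can be retraced with a short program. The sketch below follows the steps described in this note (form DCD, iterate from the column with the largest sum, normalize after each multiplication, then apply equation (7)), but the function names are hypothetical and the arithmetic is carried in floating point rather than to three or four places, so the last digit of intermediate values may differ from the table above.

```python
import math

# Peel's matrix C from the example: intercorrelations, reliabilities in the diagonal.
C = [
    [0.770, 0.518, 0.390, 0.288],
    [0.518, 0.723, 0.372, 0.109],
    [0.390, 0.372, 0.875, 0.402],
    [0.288, 0.109, 0.402, 0.837],
]


def first_principal_axis(m, iterations=100):
    """Iterate v <- m v with normalization; return (largest root, unit vector)."""
    n = len(m)
    # Start from the column with the largest sum, as in the worked example.
    start = max(range(n), key=lambda j: sum(row[j] for row in m))
    v = [row[start] for row in m]
    root = 0.0
    for _ in range(iterations):
        size = math.sqrt(sum(x * x for x in v))
        v = [x / size for x in v]
        product = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        root = math.sqrt(sum(x * x for x in product))  # normalizing factor S
        v = product
    return root, [x / root for x in v]


def max_reliability_weights(c):
    """Return (lambda, W), with W scaled so that its largest weight is 1."""
    n = len(c)
    d = [1.0 / math.sqrt(1.0 - c[i][i]) for i in range(n)]
    dcd = [[d[i] * c[i][j] * d[j] for j in range(n)] for i in range(n)]
    beta, v = first_principal_axis(dcd)
    w = [d[i] * v[i] for i in range(n)]  # W = kDV, equation (7)
    largest = max(w)
    return beta / (1.0 + beta), [x / largest for x in w]


lam, weights = max_reliability_weights(C)
print(round(lam, 4), [round(x, 2) for x in weights])
```

Scaling W by its largest element corresponds to the choice k = 1/2.0489 in the example; any other positive k gives the same battery reliability.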
A NOTE ON THE MEASUREMENT OF REVERSALS
OF PERSPECTIVE 


J. S. BRUNER, L. POSTMAN, AND F. MOSTELLER 
HARVARD UNIVERSITY 


The contradictory findings in the classical literature on the 
rate of reversal of perspective illusions stem, at least partly, from a 
failure to isolate three major sources of variance: set, subjects, and 
sequence. This paper provides a model for the statistical isolation 
of these variables. It consists of the square-root transformation of 
reversible data which appear to constitute a Poisson distribution. 
Such a transformation permits the full utilization of the analysis 
of variance. 
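
The reason a square-root transformation suits Poisson-like counts can be illustrated numerically. For a Poisson variable the variance equals the mean, so groups with different reversal rates have unequal variances; after taking square roots the variance is close to 1/4 whatever the mean, provided the mean is not too small. The sketch below computes both variances directly from the Poisson probabilities; the function names and the means chosen are illustrative assumptions, not values from the study.

```python
import math


def poisson_probabilities(mean, upper=400):
    """P(X = k) for k = 0..upper-1, computed in logs to avoid overflow."""
    return [math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1)) for k in range(upper)]


def variance(mean, transform=lambda k: k):
    """Variance of transform(X) when X is Poisson with the given mean."""
    probs = poisson_probabilities(mean)
    first = sum(p * transform(k) for k, p in enumerate(probs))
    second = sum(p * transform(k) ** 2 for k, p in enumerate(probs))
    return second - first ** 2


for mean in (10, 20, 40):
    # Raw counts: variance grows with the mean. Square roots: roughly constant.
    print(mean, round(variance(mean), 2), round(variance(mean, math.sqrt), 3))
```

A roughly constant variance across conditions is what the analysis of variance assumes, which is why the transformation permits its use here.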


At first glance, the illusion of reversible perspective appears to 
be an innocent enough example of that class of fallible phenomena 
which many of us find indispensable in demonstrating the frailty of
perception. But unlike many other striking illusions, reversible per- 
spective has had a long, noteworthy, and hectic history of investi- 
gation. For reasons not always clear to one who reads the literature 
on the subject, the reversible figures have been considered a crucial 
signpost on the road to understanding the central-most mechanisms 
controlling perception. There are studies which use rate of reversal 
as a measure of the role in perception of attention (1), fatigue (2), 
group influence (3), oscillation (4), and last but far from least, in- 
troversion and extraversion (5). 

Unfortunately, the dozens of studies in this complicated field 
have frequently contradicted each other. It is for this reason and be- 
cause we feel that the research opportunities provided by this tanta- 
lizing phenomenon have not been fully exploited that we offer a few 
statistical and psychological observations relating to its measurement. 

Many of the studies already published have suffered from a lack 
of agreement in methods of collecting and treating data. Perhaps re- 
search here, as in many other areas, might benefit from a simple and 
minimal paradigm. We should like to propose that whatever corre- 
late of reversibility one is investigating, there are at least three ubiq- 
uitous sources of variation which must be taken into account: set, 
subjects, and sequence. 

Take the Schroeder staircase as an example. The subject’s set, 
usually induced by instruction, may be aimed at maximum reversal 
of perspective. Instructions may, on the other hand, set the subject 
to inhibit reversal. Or the subject may be set to favor one phase of
perspective or the other. Finally, instructions may be designed to 
foster passive contemplation, or a “natural” attitude under which one 
can ascertain the subject’s natural rhythm. Doubtless, as Fluegel has 
shown, there are many other sets that may be induced. There now 
exists ample experimental evidence to show that set is a highly sig- 
nificant determinant of rate of reversal. 

As for sequence, or the changes which occur as an experimental 
session progresses, there is again more than enough evidence to show 
that temporal sequence makes a difference — although the nature of 
the difference is still a debating point. Continued exercise has been
found to speed up reversals in some experiments and to slow them down
in others. It may very well be, however, that the time factor influences
reversals under different sets in diverse ways. 

Finally, subjects. Whoever works with reversible perspectives 
is all too aware of the large individual differences which occur both 
in rate of reversal and in susceptibility to various experimental treat- 
ments. At least some of the inconsistencies by which the reversal 
literature is plagued can be attributed to a rather indiscriminate pool- 
ing of data obtained from widely differing subjects. The solution of 
the problem is, of course, to increase the precision of estimating the 
significance of different factors by taking systematic account of the 
variance contributed by individual differences. 

A first conclusion from our description of the problem is that the
data of reversibility should be analyzed by means of the analysis of
variance, which allows us to take systematic account of the variance
introduced singly and in interaction by set, sequence, and subjects.

We have attempted to carry out our own suggestions by collecting
a large number of observations of the Schroeder staircase, varying
systematically the three factors mentioned above. Nineteen subjects,
normal adult men, were given the task of reversing the staircase
for ten minutes under each of three sets of instructions: (1)
Alternate instruction: to reverse the figure as rapidly as possible; 
(2) Hold instruction: to keep the figure from reversing; and (3) 
Natural-rhythm instruction: to allow the figure to reverse at its own 
natural rate without any effort to reverse it or hold it. The three in- 
structions were balanced with respect to order. The data were then 
arranged as in the matrix shown in Table 1. 

There are instances when it is inadvisable to subject raw data to 
an analysis of variance. One must conform to the assumptions on 
which the technique is based. Two of the assumptions which concern 
us here are normality of distribution and homogeneity of variance 
throughout the matrix. Inspection of the raw data early suggested 

                                        TABLE 1
         Reversals per Successive One-Minute Interval of the Schroeder
                      Staircase Under Three Instructions

                                       Subjects
                  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19

“Alternate”     120  22  61  31  60  54  70  66  29  89  24  30  24  70  56  30 163  42  30
(mean = 47.4)   107  23  56  29  55  39  56  76  21  98  31  22  19  64  54  33 191  54  28
                 70  14  46  80  46  47  58  76  20  95  34  14  22  58  36  32 154  44  20
                 62  12  47  32  46  44  46  64  14  78  55  22  23  48  33  22 150  44  17
                 98  27  88  38  39  36  45  42  18  92  46  24  22  54  25  35 156  50  17
                 98  14  34  24  41  30  45  48  19  26  38  32  21  43  35  35 146  38  22
                 64  12  35  15  38  36  46  78  17 203  34  18  26  45  21  30 132  32  24
                 37   8  38  29  36  41  38  54  16 118  22  48  24  38  38  36 148  34  22
                 79   8  34  18  37  30  38  64  16  92  30  52  20  42  33  40 129  29  22
                129  11  30  32  35  28  42  67  12  86  27  53  18  43  40  32 131  35  32
“Natural” 54 20 2 28 42 43 30 5014 67 41117 18 3018 11 18 
(7216) 4515 8 28 54 38 22 6617 64 8 27 24 18 26 16 11 36 

40 14 14 18 48 39 27 58 19 62 15 23 22 11 34.19 16 39 
45 15 10 27 42 26 17 62 19 49 15 24 23 14 29 18 10 32 
831 18 18 18 35 22 22 4819 63 10 20 25 1413 16 11 15 
2816 71432241751 8 40 41820 91515 9183 
16 16 18 21 27 16 22 48 11 48 419241126 8 8 31 
22 6 12 20 82 30 16 33 18 22 8 19 21 10 24 13 16 16 
that these assumptions were not met; indeed, that the data bade fair
to approximate a Poisson distribution.

Is the Poisson model a reasonable one for us? Phenomena such 
as the number of counts on a Geiger counter, in fixed time intervals, 
the number of wireworms per plot of ground, and the number of 
phone calls per interval of time, are well known to be approximately 
distributed according to the Poisson distribution function. The num- 
ber of alternations in a one-minute interval might reasonably be ex- 
pected to follow this same distribution. We assume that the alter- 
nations are independent of one another and that a large number of 
alternations is possible in any given minute. The Poisson distribution
has only one parameter, m, the mean number of alternations,
and the variance is also m. Consequently, the larger the mean the 
larger the variability (6). 

There is a natural way to analyze data from a Poisson distribution.
One analyzes the square roots of the observations ($\sqrt{x}$) or the
square root of the observations plus .5 when the mean value is small
($\sqrt{x + .5}$). This has two important effects.

(1) Although each individual under each condition may have
a different true mean number of alternations, $m_i$, taking the square
root of the observations adjusts their variances in such a way that
the variances no longer depend on the unknown $m_i$. Indeed, the
variance of any set of Poisson data after this adjustment ideally
approximates .25. Thus we know what the variance of sets of individual
observations should be if the Poisson applies, and we have equality of
variance for every condition (a beautiful property under any
circumstances, especially for regression and analysis of variance).
Transformations such as the one being discussed are derived
mathematically with the express aim of making the variance of sets of
observations independent of unknown parameters such as true mean,
true variance, or true population proportion (see Appendix). Obviously,
then, the adjusted data should have a variance in which $m_i$
does not appear. For a brief discussion see Snedecor (7) and for a
more extended account with references see Bartlett (8).

(2) It also happens that the square-root transformation ad- 
justs Poisson data so that they are very nearly normal. 
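
A first-order sketch of why the square root removes the dependence on the mean in effect (1) may be helpful. It is a heuristic added for orientation only, assuming the mean $m$ is large enough that a linear expansion of $\sqrt{x}$ about $m$ is adequate; the fuller expansion is the one taken up in the Appendix.

```latex
\sqrt{x} \approx \sqrt{m} + \frac{x - m}{2\sqrt{m}}
\quad\Longrightarrow\quad
\sigma^2(\sqrt{x}) \approx \frac{\sigma^2(x)}{4m} = \frac{m}{4m} = \frac{1}{4}.
```

The mean cancels because the Poisson variance equals the mean, which is why the same transformation would not be expected to stabilize data whose variance grows in some other way.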

An analysis of variance on the transformed data meets the cri- 
terion of equal variance in each cell, while the original data are very 
far from meeting this criterion. Naturally, we cannot expect that 
any data will be distributed exactly according to a theoretical model. 
A preliminary graphical analysis indicated that the present data were 
in good agreement with the Poisson model. 

There are limitations on the use of the transformation, square 
root of x + .5. When the mean of the Poisson distribution gets small, 
the variance is reduced. The following table shows how rapidly the 
variance of square root of x + .5 tends to .25. 


     mean m     $\sigma^2(\sqrt{x + .5})$
        1          .1603
        2          .2135
        3          .2323
        5          .2428
       10          .2468

For most purposes we will be satisfied that the variance is nearly
constant if m is as big as 3. We have only one subject (No. 4 under
“Hold” instructions) who comes close to having a mean this small.
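
The tabled variances can be checked by summing directly over the Poisson probabilities. The sketch below is one way to do so; the function names `poisson_pmf` and `var_sqrt` and the truncation point `max_x` are our own choices for illustration, not part of the original computation.

```python
import math

def poisson_pmf(m, max_x):
    """Poisson probabilities P(X = 0), ..., P(X = max_x) for mean m, built by recursion."""
    p = math.exp(-m)
    probs = [p]
    for x in range(1, max_x + 1):
        p *= m / x
        probs.append(p)
    return probs

def var_sqrt(m, c=0.5, max_x=200):
    """Variance of sqrt(X + c) for X ~ Poisson(m), by direct summation.

    max_x truncates the infinite sum; 200 is far into the tail for the
    small means considered here (an assumption to revisit for large m).
    """
    probs = poisson_pmf(m, max_x)
    mean_root = sum(p * math.sqrt(x + c) for x, p in enumerate(probs))
    mean_square = sum(p * (x + c) for x, p in enumerate(probs))  # E(X + c)
    return mean_square - mean_root ** 2

for m in (1, 2, 3, 5, 10):
    print(m, round(var_sqrt(m), 4))
```

Setting `c=0` gives the corresponding variance for the plain square root, which can be compared in the same way.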

An analysis of variance was made of the transformed data. We 
used the square root of x + .5, and carried the square roots to three 
decimal places. We did the analysis using the square rooted data en- 
tirely, merely using the original table to check sums of squares. There 
is a temptation to use the sums from the original data plus appro- 
priate multiples of .5 as sums of squares. But we preferred to work 
entirely with the square roots. 

The variables were Subjects, Set, and Time-Sequence in each 
ten-minute test period. The analysis was carried through using Sub- 
jects and Set as a two-way classification. There was a clear-cut re- 
gression in Time-Sequence, the subjects tending to have smaller mean 
numbers of alternations as each testing period drew to a close. This 
effect may be ascribed to the onset of fatigue, as previously noted by 
Ash (9). We used the standard device for taking out this regression 
effect. It involves assigning one degree of freedom for each subject 
under each instruction — consequently 57 degrees of freedom are as- 
signed to Regression or Time-Sequence. 

It may be worth while to indicate how this is done for one subject.
Since our minutes go in equal steps it seems reasonable to assign
equal steps to t (time). Considerable trouble can be saved by
arranging the t-values to be integers and by having the average value
of t be zero. Since we are free to vary only the slope of the
regression line, we are trying to minimize

$$\sum (y - a - bt)^2$$

with respect to b. Differentiating, we get as our estimate of b the
quantity $\sum ty / \sum t^2$ because $\sum t = 0$. Then the sum of
squares taken out by the assignment of b is

$$\sum (y - a)^2 - \sum (y - a - bt)^2 = \left(\sum ty\right)^2 / \sum t^2 .$$


To illustrate with Subject 1 on “Alternate,” we have

     Minute      t        y
        1        9     10.977
        2        7     10.368
        3        5      8.396
        4        3      7.906
        5        1      9.925
        6       -1      9.925
        7       -3      8.031
        8       -5      6.124
        9       -7      8.916
       10       -9     11.380

     $\sum t^2$ = 330
     $\sum ty$ = 17.522
     $(\sum ty)^2$ = 307.020
     $(\sum ty)^2 / \sum t^2$ = .930
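
The arithmetic for Subject 1 can be retraced in a few lines. The sketch below takes the ten transformed scores from the illustration above; the function name `regression_ss` and the way the time codes are generated are assumptions made here for illustration.

```python
# Transformed scores y = sqrt(x + .5) for Subject 1 under "Alternate",
# minutes 1 to 10, copied from the worked illustration.
y = [10.977, 10.368, 8.396, 7.906, 9.925, 9.925, 8.031, 6.124, 8.916, 11.380]

def regression_ss(scores):
    """Sum of squares removed by a linear time trend with centered integer codes."""
    n = len(scores)
    # Equally spaced integer codes averaging zero: 9, 7, ..., -7, -9 when n = 10.
    t = [n - 1 - 2 * k for k in range(n)]
    sum_t2 = sum(v * v for v in t)
    sum_ty = sum(tv * yv for tv, yv in zip(t, scores))
    return sum_t2, sum_ty, sum_ty ** 2 / sum_t2

sum_t2, sum_ty, ss = regression_ss(y)
print(sum_t2, round(sum_ty, 3), round(ss, 3))
```

Repeating the call for each of the 57 subject-by-instruction series and adding the results would give the Time-Sequence regression sum of squares.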


The following table describes the results of the aforementioned 
analysis of variance. 


                                  TABLE 2
                            Analysis of Variance

                                       Sum of       Mean
Source                       d.f.     Squares      Square      F-ratio           P
Subjects                      18      707.739      39.32         2.77         P < .01
Set                            2     1036.330     518.17        36.5          P < .01
Interaction (Su × St)         36      511.529      14.21        30.5          P < .01
Time-Sequence regression      57      117.101       2.054        4.41         P < .01
Residual sampling variance   456      212.403       0.4658   $\chi^2$ = 849.61   P < .01
Total                        569     2585.102


In Table 2 we compared Subjects and Set with the Interaction 
in computing F-ratio, because these main effects contain, in addition 
to the sampling variance, multiples of the interaction variance, which 
has in turn already proved significant when compared with the resid- 
ual mean square. The regression mean square is compared with the 
residual mean square. We were interested in comparing the residual 
mean square with the theoretical value .25. This could be done by 
computing F with 456 and infinity d.f., or it could be done by com- 
puting chi-square. We have used chi-square. We do not of course an- 
ticipate that the value of .25 will really be achieved by the residual 
mean square because it contains such things as higher-order inter- 
actions and any quadratic effects in the regression and is affected by 
all the little errors that experiments fall prey to. What we are really 
interested in is whether the residual mean square is of the same order 
of magnitude as .25. We feel that the result’s being less than twice 
the theoretical value represents an extremely respectable state of 
agreement between the theoretical assumption of the Poisson distribution
and the empirical situation. A little later we will give a second
analysis of these data.
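
The comparisons just described can be recomputed from the degrees of freedom and sums of squares reported in Table 2. The sketch below is only a check on the arithmetic; the dictionary layout and variable names are ours.

```python
# (degrees of freedom, sum of squares) as reported in Table 2
table2 = {
    "Subjects": (18, 707.739),
    "Set": (2, 1036.330),
    "Interaction": (36, 511.529),
    "Regression": (57, 117.101),
    "Residual": (456, 212.403),
}

# Mean square = sum of squares / degrees of freedom
ms = {source: ss / df for source, (df, ss) in table2.items()}

# Main effects are tested against the Interaction mean square.
f_subjects = ms["Subjects"] / ms["Interaction"]
f_set = ms["Set"] / ms["Interaction"]

# Interaction and regression are tested against the residual mean square.
f_interaction = ms["Interaction"] / ms["Residual"]
f_regression = ms["Regression"] / ms["Residual"]

# Residual compared with the theoretical Poisson value .25:
# chi-square = residual sum of squares / .25, on 456 d.f.
chi_square = table2["Residual"][1] / 0.25

for name, value in [("Subjects", f_subjects), ("Set", f_set),
                    ("Interaction", f_interaction),
                    ("Regression", f_regression), ("chi-square", chi_square)]:
    print(name, round(value, 2))
```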

If we test Subjects and Set against their Interaction, we find
both are significant beyond the .01 level. The effect of Set is
particularly impressive. The Interaction and the Regression are highly
significant when compared with the Residual Sampling Variance.
Ideally, as we have indicated above, the residual mean square would
be .25. Naturally its excess over that value is significant at the
.01 level. In other words, non-chance factors are operating to
increase the variance. As usual in experimental work, there are too
many observations too far out on the tails of the distributions,
a contaminating effect. We shall adjust these to see the magnitude
of their effect in our second analysis.

A word should be said in passing about the allocation of variance
attributed to the three sets: alternation, natural rhythm, and hold.
If we partition both the Instruction sum of squares and the
Interaction sum of squares into two parts, one for alternation and one
for natural rhythm and hold taken together, we find that about 85 per
cent is assigned to alternation and about 15 per cent to the other two
instructions. We do not understand what intrinsic characteristics of
the alternation process led to this disproportionate variability. It
is a problem for further study.

Finally, the significance of the Interaction effect should be under- 
lined. The variability of subjects under different instructions indi- 
cates to us the fruitfulness of examining in more detail the differen- 
tial responses of the individual subjects to different instructions. 

We have noted earlier that there were some anomalous observations.
To observe the magnitude of their effect, especially on the
residual mean square, we carried the same analysis through again with
the four italicized observations adjusted. All four observations occur
in the “Alternate” set. We changed Subject 1’s 37 to the average of
its adjacent observations, 72; we changed Subject 10’s two extreme
adjacent observations 26 and 203 to their average 115; and we
changed Subject 17’s 191 to 163, the next largest observation. We do
not care to argue that these are particularly good adjustments to
make; we merely want to use such an adjustment to discover roughly
the magnitude of the effect of these four observations (2/3 of one per
cent of all observations) on the residual mean square. The results
are given in Table 3.

                                  TABLE 3
             Analysis of Variance with Four Observations Adjusted

Source                       d.f.   Sum of Squares   Mean Square      F-ratio
Subjects                      18        721.361         40.08           2.81
Set                            2       1048.573        524.29          36.8
Interaction                   36        512.546         14.24          39.8
Time-Sequence regression      57        114.574          2.010          5.61
Residual sampling variance   456        163.260           .3580   $\chi^2$ = 653.04
Total                        569       2560.314

In the adjusted Table 3, we find essentially no change in the mean 
squares except for the residual. Here the adjustment of four obser- 
vations has reduced the mean square by 23 per cent. All of the con- 
clusions remain unchanged. However, when a reasonable change of 
four out of 570 observations can reduce the residual mean square by 
nearly a quarter, the reader can see why we feel that our previous 
agreement with the theoretical value of .25 is relatively good, in spite 
of the probability level associated with the chi-square value. 

By the use of the analysis of variance on properly transformed 
data, it should be possible to estimate ever more carefully, singly and 
in interaction, the contribution of the many important psychological 
variables which have been investigated by the use of reversible fig- 
ures for at least half a century. 


APPENDIX 
We have been asked to provide a rationale for the square root
transformation for the Poisson. The following approach seems adequate
to show that the variance of $\sqrt{x}$ is approximately .25 for large
means.

We need

$$E(\sqrt{x}) = e^{-m} \sum_{x=0}^{\infty} \frac{\sqrt{x}\, m^x}{x!} \tag{1}$$

$$= e^{-m} \sum_{i=-m}^{\infty} \frac{\sqrt{m+i}\; m^x}{x!}
= e^{-m} \sum_{i=-m}^{\infty} \sqrt{m} \left[ 1 + \frac{i}{2m} - \frac{i^2}{8m^2} + \frac{i^3}{16m^3} - \frac{5i^4}{128m^4} + \cdots \right] \frac{m^x}{x!}, \tag{2}$$

where $i = x - m$. Here we are tacitly assuming m to be an integer,
although it need not be; we do this for simplicity of presentation.

Now the first five moments about the mean for the Poisson are
known to be

$$\mu_1 = 0, \quad \mu_2 = m, \quad \mu_3 = m, \quad \mu_4 = 3m^2 + m, \quad \mu_5 = 10m^2 + m.$$

We can take expected values by replacing $i^t$ by $\mu_t$, and we get





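$$E(\sqrt{x}) = \sqrt{m} \left[ 1 - \frac{1}{8m} - \frac{7}{128m^2} + O(1/m^3) \right], \tag{3}$$

$$[E(\sqrt{x})]^2 = m \left[ 1 - \frac{1}{4m} - \frac{3}{32m^2} + O(1/m^3) \right]. \tag{4}$$

Then

$$\sigma^2(\sqrt{x}) = E(x) - [E(\sqrt{x})]^2 = \frac{1}{4} + \frac{3}{32m} + O(1/m^2). \tag{5}$$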


A similar derivation for $\sqrt{x + .5}$ gives

$$\sigma^2(\sqrt{x + .5}) = \frac{m}{4(m + .5)} + \frac{3m^2}{32(m + .5)^3} + O(1/m^2)
= \frac{1}{4} \left[ 1 - \frac{1}{2m+1} + \frac{3m^2}{(2m+1)^3} \right] + O(1/m^2). \tag{6}$$
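
As a rough numerical check on the two approximations, the exact variances can be summed directly and set beside the leading terms. The sketch below reads equation (5) as $1/4 + 3/(32m)$ and the bracketed form of equation (6) as $\frac{1}{4}[1 - 1/(2m+1) + 3m^2/(2m+1)^3]$, both without their remainder terms; the function names and the truncation point are assumptions made for illustration.

```python
import math

def exact_var_sqrt(m, c=0.0, max_x=400):
    """Exact variance of sqrt(X + c), X ~ Poisson(m), using E(X + c) = m + c."""
    p = math.exp(-m)                 # P(X = 0)
    mean_root = p * math.sqrt(c)
    for x in range(1, max_x + 1):
        p *= m / x                   # P(X = x) from P(X = x - 1)
        mean_root += p * math.sqrt(x + c)
    return (m + c) - mean_root ** 2

def approx_var_sqrt_x(m):
    """Leading terms of equation (5), remainder dropped."""
    return 0.25 + 3.0 / (32.0 * m)

def approx_var_sqrt_x_half(m):
    """Bracketed form of equation (6), remainder dropped."""
    return 0.25 * (1.0 - 1.0 / (2 * m + 1) + 3.0 * m * m / (2 * m + 1) ** 3)

for m in (5, 10, 20, 50):
    print(m,
          round(exact_var_sqrt(m), 4), round(approx_var_sqrt_x(m), 4),
          round(exact_var_sqrt(m, 0.5), 4), round(approx_var_sqrt_x_half(m), 4))
```

The agreement should improve as m grows, since the neglected terms are of order $1/m^2$.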
BOOK REVIEWS


S. S. WILKS. Elementary Statistical Analysis. Princeton: Princeton University
Press, 1948. Pp. xi + 284. 


Readers of this review may be familiar with a more advanced book by the 
same author: Mathematical Statistics. The present book is at quite a different 
level. The preface says that the book is designed as an introductory course for 
all fields of statistical application and especially designed for those who intend 
to go into the biological and social sciences. It presupposes one semester of ele- 
mentary mathematical analysis covering topics such as those included in the 
first half of F. L. Griffin’s Introduction to Mathematical Analysis. The amount 
of calculus used will be indicated below. 

The format of the present book is not unlike that of Mathematical Statis- 
tics. It is paper-bound, 7” x 10”, photo-offset from a typed manuscript. The 
manuscript has been reduced so that the actual type is rather small, but there 
is plenty of white space between the lines. The figures are well drawn and there 
are few typographical errors. 

The introduction opens boldly with the concepts of samples and populations. 
Some 14 examples are described, and in five of these ways are illustrated for 
using the data to get valuable inferences. In this same chapter graphical meth- 
ods are briefly discussed. 

Chapter 2 treats frequency distributions with special emphasis on the cu- 
mulative frequency distribution. The first formulas appear on page 30 toward 
the end of the second chapter. Chapter 3 concerns computation of sample means
and standard deviations. The variance uses n - 1 in the denominator from the
beginning. Chapter 4 is a more extensive discussion of elementary probability 
(combinations, permutations, Euler diagrams, and mathematical expectation) 
than is found in most statistics texts. There is a brief section on geometrical 
probability (which could easily be omitted). 

Chapter 5 considers probability distributions, the uniform distribution, the 
cumulative distribution, and probability density functions, and distinguishes be- 
tween discrete and continuous distributions. 

Chapter 6 considers the binomial distribution, Chapter 7 the Poisson, Chap- 
ter 8 the normal. 

In Chapter 9 we return to the problem of sampling, and the distinction is 
made between theoretical sampling and actual sampling. It is indicated how to 
compute the mean and variance of a finite population, and an easy derivation 
is given for the mean and variance of the binomial. A more difficult one had 
been given in the chapter on the binomial. In the same chapter sampling dis- 
tributions of sums and differences of sample means are discussed. 

Chapter 10 considers the problem of confidence limits of population para- 
meters, applying the method of the binomial, means of finite populations, and the 
normal distribution, as well as differences between means for both the normal
and the binomial populations. It also introduces the t-distribution. The confi- 
dence limit approach is used as a springboard for the six-page Chapter 11 on 
statistical significance tests. 
Chapter 12 considers tests of randomness, employing the nonparametric
method of runs and the quality control chart.

Chapter 13 concerns analysis of pairs of measurements, treating regression 
from the point of view of least squares, but introducing the sampling variability 
of the regression coefficient and a chart for obtaining confidence limits of the 
correlation coefficient for samples of size 3 to 400. 

We now take up some of the unusual features of the book. The summation
notation is introduced on page 34, and $\sum_{i=1}^{n} X_i$ is replaced by S(X). It might be
remarked that the denominator of the variance is usually written on the left-
hand side as

$$(n - 1)s_X^2 = S(X^2) - \frac{1}{n}[S(X)]^2.$$


It is not clear to the reviewer whether this practice is used deliberately by the
author to make formulas look less impressive to the student, or whether this is
merely a device to save space in the printing. Similarly when the mean is defined,
$n\bar{X}$ is found on the left of the equation. On page 38 we find a rather unusual
explanation of degrees of freedom for an elementary text, where it is indicated
that the sum of the squares of the deviations from the mean can be reduced
in the case of a sample of n terms to the sum of n - 1 squares.

The definitions of probability seem to the reviewer to be unusual. Two 
definitions on page 61 are given below—the first is to handle cases where the 
theoretical probability can be deduced by argument, as in the case of coins, while 
the second is for more complex cases where probabilities can be derived only 
from observation. 


Definition I. If an event E can happen in m cases out of a total of n 
possible cases which are all considered by mutual agreement to be equal- 
ly likely, then the probability of the event E is defined to be m/n, or 
more briefly Pr(E) = m/n. 


Definition II. If (a) whenever a series of many trials is made, the 
ratio of the number of times event E occurred to the total number of 
trials is nearly p, and if (b) the ratio is usually nearer to p when longer
series of trials are made, then we agree in advance to define the probability
of E as p, or more briefly Pr(E) = p.


And on page 96 an additional definition is given of geometric probability: 


Definition of Geometric Probability. If an event E can happen by the occurrence
of a point in a region $R_1$ within a region R, all points in R being
considered by mutual agreement to be equally likely, then the probability
of the event E is defined as $C_1/C$, or Pr(E) = $C_1/C$, where
$C_1$ is the content of $R_1$ and C is the content of R.


It will be noticed that the first and the third definitions depend on the concept of 
mutual agreement. This seems a fairly useful way of defining probability, but it 
certainly leads to operational difficulties. For example, with respect to Definition 
I, it might be mutually agreed by some people that when two coins are flipped, the 
three cases are: 2 heads, 1 head, and 0 heads, and that these cases are equally 
likely (one of the exercises requests the student to explain what is wrong with 
this formulation). Definition II based on observational evidence might therefore
be used to verify or refute the agreement reached in Definition I. The wording of
(a) in Definition II undoubtedly helps the student with the notion that sample
values cluster around population values. But if we assume a thousand coin tosses
to be “many trials” as the author does in the illustrative material, then if we
ever had the rare fortune to observe 75% heads in even one such sample out of
many thousands of samples, it could be argued that .75 is not “nearly” 1/2, and
therefore that the probability could not be 1/2 even though part (b) of the
definition held. These definitions may go through more revisions by the author, but they
do indicate that one of our foremost statisticians is grappling with the problem 
of making probability an operational concept in the elementary statistics class. 
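
The difficulty with the three-case agreement for two coins can be made concrete by enumeration. The sketch below assumes that the four ordered outcomes of two fair coins (HH, HT, TH, TT) are the equally likely cases; the function name `head_count_distribution` is ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def head_count_distribution(n_coins):
    """Probability of each number of heads when the ordered outcomes are equally likely."""
    outcomes = list(product("HT", repeat=n_coins))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {heads: Fraction(k, len(outcomes)) for heads, k in counts.items()}

dist = head_count_distribution(2)
for heads in sorted(dist):
    print(heads, "heads:", dist[heads])
```

Under this assumption the case of one head is twice as probable as either of the others, so the three cases cannot all be assigned probability 1/3.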

The first necessity for the integral calculus seems to occur on page 115, but 
it is necessary only to be able to integrate polynomials and to understand the 
definite integral as the area under a curve. 

In the chapter on the binomial a simple recursion formula is used for gener- 
ating the terms of the binomial and, unusual in an elementary text, a generating 
function is used to get the mean and variance. It is necessary only that the stu- 
dent be able to differentiate polynomials. It is also indicated that it is not essen- 
tial for the student to follow the demonstrations here because of the easier dem- 
onstration given later. 

In the derivation of the Poisson from the binomial, it is necessary to understand
the concept of limits, in particular the notion that

$$\lim_{n \to \infty} (1 - 1/n)^n = e^{-1}.$$

Generating functions are again used to get the mean and variance.

When the normal approximation to the binomial is discussed, the often neg- 
lected correction of 1/2 is carefully treated. 

In the chapter on runs (or randomness), the statement is made that “ ‘Nat- 
urally occurring’ causes or factors which produce very many short runs do not 
occur very often.” However psychologists may feel that in their material this 
problem does arise frequently, in particular with alternations of rats running 
mazes or with people giving too many short runs when they call color sequences 
of cards such as red or black. Nevertheless, a table for use in testing for too 
many runs as well as too few runs is given. 

In the last chapter partial derivatives are needed for least squares, but the
derivations could be omitted. 2 × 2 determinants are used, but these can be
explained in a couple of minutes. In the discussion of fitting more complicated


curves such as

$$Y = Ae^{rx}$$

the author proposes the usual device of using least squares on the logarithm of
the equation, but fails to point out that the fitted curve so obtained is not
identical with that gotten by minimizing

$$\sum (Y - Ae^{rx})^2 .$$
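
The distinction can be seen on a small example. The sketch below writes the curve as Y = A·exp(r·x), uses five data points invented purely for illustration, fits once by least squares on log Y and once by minimizing the sum of squared deviations in Y itself. The golden-section search over r is simply one convenient way to do the direct minimization, not a method taken from the book under review.

```python
import math

# Hypothetical data, invented for illustration (roughly exponential with noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.4, 7.9, 19.0, 56.0]

def sse(a, r):
    """Sum of squared deviations of Y from A * exp(r * x)."""
    return sum((y - a * math.exp(r * x)) ** 2 for x, y in zip(xs, ys))

def fit_log_linear():
    """Ordinary least squares on log Y = log A + r * x."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    x_bar, l_bar = sum(xs) / n, sum(logs) / n
    r = (sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, logs))
         / sum((x - x_bar) ** 2 for x in xs))
    return math.exp(l_bar - r * x_bar), r

def best_a(r):
    """For fixed r the criterion is quadratic in A, so A has a closed form."""
    return (sum(y * math.exp(r * x) for x, y in zip(xs, ys))
            / sum(math.exp(2 * r * x) for x in xs))

def fit_direct(r_lo, r_hi, iterations=200):
    """Golden-section search over r, with A set to best_a(r) at each step."""
    g = (math.sqrt(5) - 1) / 2
    profile = lambda r: sse(best_a(r), r)
    for _ in range(iterations):
        c = r_hi - g * (r_hi - r_lo)
        d = r_lo + g * (r_hi - r_lo)
        if profile(c) < profile(d):
            r_hi = d
        else:
            r_lo = c
    r = (r_lo + r_hi) / 2
    return best_a(r), r

a_log, r_log = fit_log_linear()
a_dir, r_dir = fit_direct(r_log - 0.5, r_log + 0.5)
print("log fit:   ", a_log, r_log, sse(a_log, r_log))
print("direct fit:", a_dir, r_dir, sse(a_dir, r_dir))
```

The logarithmic fit weights proportional errors equally, whereas the direct fit is dominated by the largest values of Y, so the two sets of constants generally differ.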


All in all, the book is a fairly stiff one for freshmen. It requires quite a little
elementary mathematical skill, but the freshmen will have come prepared with a 
good mathematics survey course and will perhaps be eager to work on the many 
examples taken from subject-matter fields. The examples given in the text are 
pretty heavily devoted to real data drawn from the literature of chemistry, biology,
physics, industry, and sociology. This is not quite so true of the exercises.
The reviewer has made the following rough classification of the approximately 
250 exercises for students: 

Public opinion polling, audience research, consumer research      5%
Government, economics, sociology, analysis of literary style      4%
Business, education, everyday life                               11%
Biology (including mortality)                                     6%
Physics, chemistry, military                                      6%
Industrial production and research                               11%
Balls, chips, thumbtacks, pennies, cards, dice                   27%
Mathematical (no subject-matter content)                         14%
Unclassified                                                     16%

It might be noted that tossing thumbtacks is a clever device for obtaining a 
biased coin. 

This analysis of exercises suggests that if the course is designed for biolo- 
gists and social scientists, it would be worth the author’s while to increase the 
proportion of exercises drawn from these subject-matter fields. It is rather diffi- 
cult to avoid the balls, chips, etc. category in a good treatment of combinatorial 
problems, but these problems may not take the student so much time as their per- 
centage would indicate. 

This book could be used as a first course in statistics even by students who 
do not have the requisite mathematics, providing the instructor were careful to 
indicate the sections and derivations the student would not need to undertake. 

If, in the next revision, the author reduced the number of mathematical
and ball and chip problems, trading them in for more examples drawn seriously
from the social sciences, this book would bid fair to provide an answer to the
ever-growing number of introductory courses in statistics in subject-matter fields.
This duplication of effort at the introductory level is beginning to plague many 
universities and colleges. Even as the book stands now, it is easy for a student 
to see that useful inferences can be drawn by the same technique in a wide va- 
riety of subject-matter fields. With the help of a strong instructor, the student 
will get the notion that he is learning statistics to apply anywhere and not just to 
his own problems. Such an integrated course has the added advantage that a 
change of subject-matter fields will not result in the student having to take a new 
elementary course in statistics. 

Psychologists who are teaching elementary courses will be pleased with the 
treatment and the discussion, but will be sorry to see the lack of psychological 
examples. They would do well to have this book on their shelves because in many 
cases the elementary treatments are different and suggestive. Persons teaching 
freshman and sophomore statistics courses as service courses for a university but 
not as mathematical statistics courses should certainly consider this text. 


Harvard University Frederick Mosteller 


TRUMAN LEE KELLEY. Fundamentals of Statistics. Cambridge: Harvard 
University Press, 1947. Pp. xvi + 755. 


The numerous teachers and investigators who for many years have profited 
by the use of Professor Kelley’s earlier Statistical Method (1928) will naturally 
welcome the appearance of this volume. It is not a revision of the former work: 
it is a new volume. Similarities are apparent, but most of the material and the
presentation are new. In it are the author’s reactions to the enormous develop- 
ment of mathematical statistics during the past quarter century, his interpreta- 
tion of them, and some of his applications. It is no mean task to digest this ma- 
terial, to see its implications for psychological and educational statistics, and to 
present an organized treatment of it. In many respects the author has succeeded 
in this, in other respects he has fallen short. There has been what many will re- 
gard as a biased selection of the new material—a preference for that pertaining 
to mathematical bases of statistical ideas and procedures as against applications, 
except, it would seem, those that have come to the author’s personal attention. 
Contributions appearing in mathematical-statistical sources have been favored 
over those, for example, appearing in Psychometrika. 

In introducing the subject, the author has stressed the fact that statistics 
is a logic, a way of thinking rigorously about observations. Philosophical orien- 
tation of the initiate is regarded as very important. Chapter I is devoted pri- 
marily to this objective, with a discussion of the nature of statistics, the need for 
statistics, and their utility. Considerable attention is given to the fact that both- 
ers most teachers, that students entering the study of psychological or educa- 
tional statistics are woefully unprepared with respect to mathematical back- 
ground. Professor Kelley finds it necessary to present statistical theory and meth- 
ods from an advanced mathematical standpoint. In the first chapter is a “quickie” 
course on logarithms, graphs, equations, and significant figures, plus definitions 
of many concepts and symbols. It is very doubtful whether this will be at all 
adequate to the reading and comprehending of much of the later material for the 
student who has had only elementary algebra. A mathematical-background ex- 
amination is provided in an Appendix, with key and norms, for the assessment 
of the students’ mathematical status. The use to be made of this examination is 
left to the instructor or student. 

Chapter II is a highly logical classificatory treatment of types of statistics. 
The major categories are “temporal, spatial, qualitative, and quantitative,” and 
their sub-categories. In the two chapters following this, these types are given 
more concrete substance by reference to tabular and graphic treatments. These 
two chapters are excellent. Much useful advice is given on the wise preparation 
of tables for different purposes. Graphic methods are liberally illustrated, though 
the examples are somewhat restricted to a few types with some repetition, and 
they are often based upon data from fields other than psychological or educa- 
tional, e.g., weather data and economic data. Most of the volume seems aimed at 
the student of psychology and education, hence such exceptions would seem un- 
necessary. Novel devices are suggested to aid in the grouping of data so as to 
maximize the descriptive value of distributions. 

Departing from tradition, Kelley treats the subject of variability before that 
of central tendency. He gives reasons, but they may not justify the inversion to 
teachers who are accustomed to the opposite order. Whether the student will 
profit by the change remains to be seen. It would seem that, since the days of 
Herbart, beginning with what the student already partly knows is good psychol- 
ogy, though it may be poor logic. There is a quite novel introduction to the con- 
cept of variation and variability by reference to inter-pair differences. The re- 
viewer has found this conception to be a good pedagogical device in developing 
the idea of variance. The description of this conception in mathematical terms, 
however, and the derivation of a statistic “f” based upon those differences, must
be not easy to teach. This statistic seems to have little promise of practical use, 
and its discrimination from the statistic V (variance) must mean another difficult
hurdle for the student. It may add materially to the understanding of the
more advanced student after the concept of variance is well established. Kelley’s 
emphasis upon the concept of variance and his relatively less mention of the 
standard deviation is in keeping with recent trends. The idea of variance is much 
harder for the student to grasp than that of variability, but it is probably wise 
to use the term much more than has been common, hoping that by repeated stimu- 
lation the student’s understanding of it will grow. As in his previous work, Kel- 
ley makes considerable mention of higher moments. The distinction between 
biased and unbiased estimates of population variance is clearly brought out. He 
proposes a new statistic descriptive of variability, namely, Pv, the range from 
the 7th to the 93rd centile. Its only noteworthy advantage is its relatively low
standard error. It will probably not become very popular. 
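
The relation between inter-pair differences and variance, and the distinction between biased and unbiased estimates, can be illustrated numerically. What follows is a minimal sketch in Python, not Kelley's notation or procedure: the scores are hypothetical, the function names are invented for the illustration, and the centile interpolation rule is the "inclusive" rule of the Python standard library, which may differ from the rule prescribed in the text. No attempt is made to reproduce the statistic “f.”

```python
from itertools import combinations
from statistics import pvariance, variance, quantiles


def half_mean_squared_pair_difference(xs):
    """Half the mean of (a - b)**2 over all unordered pairs of observations."""
    pairs = list(combinations(xs, 2))
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


def centile_range(xs, low=7, high=93):
    """Distance between two centiles (interpolation rule is an assumption)."""
    cuts = quantiles(xs, n=100, method="inclusive")  # cuts[k - 1] is centile k
    return cuts[high - 1] - cuts[low - 1]


scores = [12, 15, 9, 20, 17, 14, 11, 18, 16, 13]  # hypothetical test scores

biased = pvariance(scores)    # sum of squared deviations divided by n
unbiased = variance(scores)   # sum of squared deviations divided by n - 1
pairwise = half_mean_squared_pair_difference(scores)
spread = centile_range(scores)

print(biased, unbiased, pairwise, spread)
```

For any set of scores, half the mean squared difference between pairs coincides with the unbiased estimate of variance, which is one sense in which pair differences can serve to introduce the idea of variance.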

In his discussion of central tendency, Kelley pays much more attention than 
is usually true to the estimation of a mode. He believes that genuinely skewed 
distributions are quite common in nature, due to restricting influences. Measures 
of kurtosis as well as of skewness are given unusual attention. There is an ex- 
cellent treatment, though brief, of the harmonic mean and its application. 

There are very thorough discussions of chi square, F, and t, their mathematical
bases, their interrelationships, and their applications, in principle, at
least. Of the three, F is regarded as being most fundamental or general. Kelley 
is more liberal than many statisticians in tolerating the application of chi square 
and t when frequencies are small. 

Analysis of variance is introduced in connection with regression problems. 
There is a good integration of correlational and variance analysis of data. Con- 
siderable space is given to the correction of coefficients of correlation for errors 
of grouping, for attenuation, for shrinkage, and for other reasons. 

In the discussion of reliability of measures, the distinction between sampling 
errors and measurement errors is clearly brought out and some of the implica- 
tions are mentioned. The relations of test reliability to item theory and item sta- 
tistics are almost entirely neglected. A few of the consequences of reliability are 
mentioned, however, including the effect of errors of measurement upon classifi- 
cation of individuals, the weighting of components in a composite, the discrimina- 
tion of groups, and the effects of restriction of range. 

Chapter XIII is devoted to miscellaneous statistical topics and procedures, 
for example: tests of periodicity in time series; lead and lag in time series; curve 
fitting; optimal size of class intervals; the standard error of a coefficient of cor- 
relation corrected for range; and methods of extracting square roots and cube 
roots. Probably for the first time the newly-developed subject of sequential analy- 
sis comes in for extended treatment in a textbook on general statistics. 

Chapter XIV is another one on miscellaneous topics, more purely mathemati- 
cal than statistical. There are discussions of matrices and determinants and their 
applications; of basic factorials and the gamma function; of Lagrange multi- 
pliers; of the point binomial; and of the Fourier series. This chapter and the 
preceding one are extreme symptoms of somewhat inadequate organization and 
unity in other parts of the volume. On the one hand, topics are brought together 
that are rarely mentioned in other statistical textbooks. On the other hand, they 
represent procedures that are rarely used, and many of them are so briefly treated 
that the less sophisticated investigator will find them very difficult to apply. 

The general statistical tables provided in a final chapter are limited to the
Lagrangian interpolation coefficients (nine pages), the familiar Kelley-Wood tables
of the normal-curve functions, and a four-page table of square roots and
cube roots. The author provides, however, a very extensive bibliography, with 
brief comments on each one, of all the important published statistical tables. 
Other short and incidental tables will be found scattered among the various chap- 
ters. It is almost certain that the student and the investigator will sorely miss 
tables of chi square, t, and F. 

About 50 pages in Appendix B are devoted to (1) a list of statistical sym- 
bols and their definitions; (2) a glossary of mathematical terms; and (3) an in- 
dex to formulas given in the text. A 13-page highly selected bibliography and 
a 13-page general index complete the volume. The index is a vast improvement, 
with respect to completeness, as compared with that in Statistical Method. As 
with most statistical handbooks, however, many a user will wish the index were 
even more complete and detailed. 

In general evaluation, it may be said that the volume, like Statistical Method, 
will be a most useful and dependable handbook that every investigator who uses 
statistics at all extensively will want to have available. In one source can be 
found a larger variety of topics on “fundamentals of statistics” and their mathe- 
matical bases than in any book written for the same consumers. The investiga- 
tor, however, will find serious gaps. Analysis of variance receives limited treat- 
ment, and experimental design is merely mentioned. Factor analysis, though per- 
haps not generally regarded as a fundamental statistical procedure, is merely 
defined. The extreme brevity of treatment of many topics will leave the reader 
in a quandary as to how to apply many procedures. Occasional enigmatical statements
will bother the student, particularly, for example, “. . . any function of
a series of measures is an average of them if it equals them in the special case
when they all have the same value . . .” (p. 234). This handbook will need
considerable supplementation on the investigator’s workshelf.

As a textbook, this volume is best suited to students who have had a prepara- 
tion for it in college mathematics and to courses in advanced statistics. The 
average beginning student without that background will probably sink rather 
than swim after proceeding beyond the fifth chapter. As was pointed out earlier 
in this review, there is much that is profitable to the student and not often rep- 
resented in other text books. This suggests a distinct value for the volume in 
the form of supplementary reading, selected to suit the readiness of the student 
to profit by it. 

The typography makes reading difficult. It is hoped that this edition, being 
apparently of the offset variety, will be superseded by a printed form. It is hoped
that the printed form will also exhibit materially improved organization and more 
elaboration in many places. The excessive use of Roman numerals in number- 
ing tables and charts detracts from facile reading and appearance, e.g., Table 
VIII J and CHART VIII VIII. The volume is remarkably free from typographi- 
cal errors, in view of the enormous difficulties with this type of material. 


University of Southern California J. P. Guilford 


N. RASHEVSKY. Mathematical Biophysics. Revised edition. Chicago: Univer- 
sity of Chicago Press. Pp. 669. 
The revised version of Mathematical Biophysics appears a decade after the 
first edition, and a dozen years after the author founded at the University 
of Chicago a unique group of research workers in the field of mathematical biology.
It seems appropriate to take the occasion of reviewing this book as an opportunity
to evaluate as well as may be possible the work of the author and his
colleagues. 

Rashevsky’s explicit aim is to develop a systematic and deductive mathe- 
matical theory of biological phenomena, with mathematical physics as model and 
inspiration and as a source of fundamental natural laws from which the laws of 
biology should be deducible. The results of this program (except for a remark- 
able series of studies in mathematical sociology, which appear in a book called 
Mathematical Theory of Human Relations) are here grouped into four sections, 
which deal with the properties of vegetative cells, with nerve excitation and con- 
duction, with the central nervous system, and with “The Organism as a Whole 
and the Organic World as a Whole.” The scope and variety of the material 
treated is evident from a sampling of chapter headings, which range from cell 
division, growth and mitosis, cancer, cell polarity, and organic form, through re- 
action times, psychophysical discrimination, and conditioned reflexes, to the ge- 
stalt problem, rational learning and thinking, visual aesthetics, and abstraction, 
and finally to the forms of plants and the locomotion of snakes (studies reminiscent
of the interests of the late D’Arcy Thompson), culminating in a final
chapter entitled “The Organic World as a Whole.” 

The material which is of most interest to the readers of this journal is in 
the second and third sections of the book, and particularly in the third. The 
treatment of nerve excitation, which is a development of a theory advanced by 
A. V. Hill, does not involve an explicit physical model or interpretation. Its basic 
concepts are “factors” of excitatory and inhibitory nature, which are otherwise 
carefully left unspecified, although some speculative remarks as to their prob- 
able character are included. These factors are assumed to obey certain simple 
and plausible differential equations, which in essence postulate that each factor 
increases in a neurone at a rate proportional to the intensity of the exciting 
stimulus, and dies away at a constant percentage rate. Neurones are classified 
in terms of their constants which appear in the equations; the classification may 
crudely be boiled down to “on the whole excitatory” and “on the whole inhibi- 
tory.” Such properties as rheobase, chronaxie, and the like are investigated. 
Various types of excitation, all electrical in nature (constant current, alternat- 
ing current, condenser discharge, and so on), are studied. Such phenomena as 
electrotonus are investigated with the aid of the theory. 
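
Read literally, the verbal description above corresponds to a first-order linear differential equation for each factor. The sketch below, in Python, is one assumed reading and not a transcription of the equations in the book: the names gain and decay, the zero resting level, and the restriction to a constant stimulus switched on at time zero are all assumptions of this illustration.

```python
import math


def factor_level(t, stimulus, gain, decay, initial=0.0):
    """Closed-form solution of d(level)/dt = gain * stimulus - decay * level
    for a constant stimulus applied from t = 0 onward."""
    steady = gain * stimulus / decay  # level at which growth and decay balance
    return steady + (initial - steady) * math.exp(-decay * t)


def euler_level(t_end, stimulus, gain, decay, steps=100_000):
    """Forward Euler integration of the same equation, as a numerical check."""
    dt = t_end / steps
    level = 0.0
    for _ in range(steps):
        level += dt * (gain * stimulus - decay * level)
    return level


# Hypothetical constants, chosen only to exercise the two functions.
exact = factor_level(2.0, stimulus=1.5, gain=2.0, decay=0.8)
approx = euler_level(2.0, stimulus=1.5, gain=2.0, decay=0.8)
print(exact, approx)
```

Under this reading a factor rises toward the plateau gain * stimulus / decay and falls away exponentially once the stimulus is withdrawn; an excitatory and an inhibitory factor would differ only in their constants.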

Conduction is conceived as purely electrical, involving excitation of regions 
adjacent to an excited region by bioelectric currents. The treatment is therefore 
essentially that required for a cable. Specific ion effects are not discussed, nor 
are humoral theories of conduction and excitation. The latter are at present both
incomplete and controversial, so that it is not possible to say whether they con- 
flict with Rashevsky’s theory or not. 

The first chapter of this section briefly discusses the problem of the nature
of excitability and excitation, but does not get very far toward a solution. How- 
ever, the formal developments of the rest of the section are independent of this 
gap. The results seem on the whole promising, although there are defects and 
failures in detail, many of which the author explicitly notes. The reviewer does 
not feel competent to discuss some of the more refined points at issue in the de- 
tail which perhaps they deserve. 

However, the third section of the book subjects this material to an indirect 
yet fairly stringent test, the results of which suggest that some of the very tech- 
nical objections that have been raised against the theory are not so serious as 
they appear at first. Rashevsky’s treatment of the central nervous system and
the problems of behavior is based largely upon the results of the preceding section.
His attack upon the problem may be expressed rather simply in the following
postulate: The neuroelements of the central nervous system follow substantially
the same simple laws as peripheral neurones; all the complexity of the
functions of the central nervous system may be accounted for in terms of vari- 
ous complex combinations or networks of such simple elements. 

This assumption — that by putting together enough excitatory and inhibi- 
tory neuroelements with the right choices of their constants one can get circuits 
which will do virtually any trick one pleases — seems to be justified by the re- 
sults presented. It may of course nevertheless be wrong; but it is a bold, simple, 
and yet ingenious approach, and permits a consistent treatment of an enormous 
number of very fundamental psychological problems. Where it has been put to 
an adequate experimental test, it comes out rather well. Anyone familiar with 
the vagaries of curve-fitting is of course aware of some of the dangers to be encountered;
and there are certainly many figures in the book in which the data
agree with the theory but obviously might agree with any of a dozen different 
theories. However, curves like those found on pages 393, 414, 416, or 503 (to 
take a few examples) do not look as if they could be obtained so readily, and the 
agreement of data and theory in such cases is quite impressive. One of the im- 
mense advantages of a rational and coherent theory as against a limited and 
ad hoc theory is often illustrated: since the constants of the theory are not ar- 
bitrary, independent checks are often possible. Page 474, for instance, exhibits 
a set of three experimental curves and the corresponding theoretical curves. The 
theory involves two constants, which were obtained from two points on one of 
the curves. The two remaining curves were drawn with the aid of these con- 
stants, and the fit to the data is as close as anyone could desire. 

The devices used in this theory are so interesting that it is a temptation to 
describe many of them in detail. In view of the limitations of space, we must 
restrict ourselves to some sketchy remarks on mechanisms and their results. The 
attack upon the problem of discrimination of intensities is a good example to 
begin with. In essence, the kind of circuit postulated is a set of parallel path- 
ways with inhibitory cross-connections. If the input of each pathway is stimu- 
lated with the same intensity (and the paths all have the same constants), mu- 
tual inhibition is complete, and no reaction results. But if one stimulus is larger 
than the others, it will inhibit the pathways initiated by these others whereas 
its own route will not necessarily be inhibited. A more subtle mechanism, based 
on the properties of the time curves of the excitatory and inhibitory “factors,” 
is used to obtain discrimination of relations, such as “greater than” or “less than.” 
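
The behavior of such a circuit for intensity discrimination can be imitated in a few lines. The following Python sketch is a toy model under stated assumptions, not Rashevsky's equations: each pathway is taken to be inhibited by the mean input of its competitors, outputs are floored at zero, and the function name and the intensities are hypothetical.

```python
def cross_inhibited_outputs(intensities):
    """Output of each parallel pathway after mutual inhibition.

    Assumed rule: output = own input minus the mean input of the other
    pathways, never less than zero. Requires at least two pathways.
    """
    n = len(intensities)
    total = sum(intensities)
    outputs = []
    for s in intensities:
        inhibition = (total - s) / (n - 1)  # mean drive from the competitors
        outputs.append(max(0.0, s - inhibition))
    return outputs


equal = cross_inhibited_outputs([5.0, 5.0, 5.0])    # no pathway responds
unequal = cross_inhibited_outputs([5.0, 7.0, 5.0])  # only the strongest responds
print(equal, unequal)
```

With equal inputs every pathway is silenced; raising one input leaves that pathway alone with a positive output, which is the qualitative result described above.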

Visual gestalt transposition is treated by means of a circuit for “scanning” 
visual contours, so that the contour determines a reaction irrespective of the 
color or shade of its interior region or its background. Higher-order circuits are 
superimposed upon this so as to get recognition of shape irrespective of size or 
orientation. 

Conditioned reflexes are treated with the aid of self-exciting neural circuits, 
which are placed in the pathway of the conditioned stimulus and receive colla- 
teral connections from the unconditioned stimulus. The conditioning experiment 
results in a total intensity high enough to excite these circuits; they then remain 
partly excited, so that they effectively lower the threshold in the path of the con- 
ditioned stimulus. A theory of learning is based upon the theory of conditioning; 
and it is interesting to note that this theory involves return circuits excited by 
responses during the learning process — in other words, mechanisms of the kind 
which are called “feedback” by electronic engineers and which have received
much stress in recent speculations concerning the central nervous system. A the- 
ory of rational learning is also developed, based on the idea of rational learning 
as a sort of mental trial-and-error leading to the recognition of familiar patterns 
as parts of unfamiliar ones. The theory is somewhat marred by the introduction 
of the notion of “spontaneous concentration of attention” without any hint of a
neural mechanism for accomplishing this. 

Chapter XLV is a fascinating but frankly speculative account of possible 
mechanisms for consciousness, memory, and a number of neurotic and psycho- 
pathic conditions. A weakness of this and of the chapter on visual aesthetics is 
their dependence on the highly unsubstantiated notion of a “pleasure center” and 
a “pain center.” The existence of such centers seems unlikely to receive much 
experimental support. However, it seems likely that the theory could be revised
in such a way as to include the ideas of pain and perhaps of pleasure without 
recourse to the device of localization. 

The chapter on the Boolean algebra of neural nets is an account of an im- 
portant and truly remarkable development, by which the methods of symbolic 
logic rather unexpectedly furnished a solution to one of the basic problems of the 
theory. In all the other work, networks were postulated and the actions of such 
networks determined deductively. This left untouched the following problem:
Given a type of behavior, to determine deductively the corresponding neural net- 
work. It would take too much space to describe the method by which this prob- 
lem was solved, although in essence it is strikingly simple. A shortcoming of 
the account given here is that the algorithm for constructing networks from 
symbolic equations of behavior is not given explicitly but left to be inferred from 
illustrative examples. 
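
The flavor of the inverse problem may be conveyed by a small example. The Python sketch below is not the algorithm of the book, which, as noted, is not stated explicitly; it merely assumes idealized all-or-none units with a firing threshold and absolute inhibition, and shows how a symbolic statement of behavior ("respond to exactly one of two stimuli") can be turned into a two-layer network. All function names are hypothetical.

```python
def threshold_unit(excitatory, inhibitory, threshold):
    """Return 1 (fire) when no inhibitory input is active and the number of
    active excitatory inputs reaches the threshold; otherwise return 0."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def both(a, b):          # "a and b"
    return threshold_unit([a, b], [], threshold=2)


def either(a, b):        # "a or b"
    return threshold_unit([a, b], [], threshold=1)


def a_without_b(a, b):   # "a and not b"
    return threshold_unit([a], [b], threshold=1)


def exactly_one(a, b):
    # Behavior "(a and not b) or (b and not a)" built from three units:
    # two first-layer units feeding one second-layer unit.
    return either(a_without_b(a, b), a_without_b(b, a))


table = [exactly_one(a, b) for a in (0, 1) for b in (0, 1)]
print(table)
```

Each connective in the symbolic expression is replaced by one unit, so the structure of the network can be read off from the structure of the expression.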

One defect of the whole section it shares with the classical stimulus-response 
psychology which it follows so closely. It has virtually nothing to say about the 
kind of partly spontaneous reflex behavior which has been termed “operant” by 
Skinner. However, the chapter on learning seems to contain the germ of an ade- 
quate theory of operant behavior, so that this defect is probably not intrinsic. 

Any enterprise of such ambitious scope as that reported in this book must 
of necessity show a certain amount of unevenness. The reader has gathered from 
the preceding paragraphs that the treatment of neurophysiological and psycho- 
logical problems, though doubtless subject eventually to considerable revision, 
forms an impressive and useful contribution. The material on the biophysics of 
the cell, however, is in the opinion of the reviewer more limited. The leading 
idea used here is that gradients in the concentrations of dissolved substances re- 
sult in gradients of diffusion pressures, and so in mechanical forces on the parts 
of the cell. Cell division, cell movements, and mitosis are then described in terms 
of these forces. Since an important class of solutes consists of the compounds 
involved in carbohydrate metabolism, and since gradients of these compounds are 
maintained by the metabolic reactions into which they enter, a relation is thus 
established between metabolism and the mechanical phenomena. An attempt is 
made to derive the relations found by Warburg between a high aerobic glycoly- 
sis and rapid rate of cell division, as in cancers. The idea is rather neat, and 
some of the applications are probably valid. But one leaves this section with a 
feeling of disappointment because it omits many topics which practically all 
contemporary biologists would agree are vital. 

It seems strange, in 272 pages devoted to the application of physical theory 
and mathematics to the problems of cell biology, to find virtually no mention of 
such exciting topics as the nature of self-duplicating units and the mechanism
of self-duplication, or the relation between catabolic energy and its application 
to synthetic processes, or the metabolic significance of intracellular structure. 
One could construct a list of topics slighted, all of which seem especially to in- 
vite the kind of treatment at which Rashevsky is adept. 

His program, however, remains incontestably sound and attractive, and bi- 
ologists, whatever reservations they may care to make, must acknowledge a con- 
siderable indebtedness to Rashevsky and his co-workers. 

John M. Reiner 


ROBERT L. THORNDIKE. Personnel Selection: Test and Measurement Tech- 
niques. New York: John Wiley & Sons, 1949. Pp. viii + 358. $4.00. 


Despite the advances in psychological measurement and statistical technique 
displayed in the pages of Psychometrika and similar journals over the past ten 
years, teachers of courses in this area of knowledge can justly despair over 
the lack of practical, comprehensive, and scholarly texts. A number of recent 
textbooks in psychological statistics have included fairly thorough treatments of 
the theory of psychological measurement and of correlational prediction prob- 
lems, but have seldom discussed the difficulties of applying this theoretical knowl- 
edge to practical situations in large-scale personnel research. On the other hand, 
textbooks on psychological testing and personnel selection have been devoted 
largely to descriptions of kinds of tests and reviews of research results in selecting
personnel for various types of jobs, with usually only an elementary discussion
of psychometric techniques.

Thorndike’s book seems to go a long way towards filling the need for a text 
which concerns itself with the practical applications of psychometric and sta- 
tistical techniques in personnel selection programs. The book was bound to be 
practical, for it is an outgrowth of the author’s four years of experience in the 
Army Air Force psychological research program during the last World War. As 
a matter of fact, the first eight chapters of the book are re-written from Thorn- 
dike’s earlier publication, Research Report No. 3: Research Problems and Tech- 
niques (U. S. Govt. Printing Office, 1947). The two major (and related) faults 
of the book are these: (1) it does not go far enough; and (2) despite strenuous 
efforts on the part of the author to expand his earlier material so as to be appli- 
cable to personnel research in a civilian setting, the organization and content re- 
tain too many vestiges of the particular problems encountered and the solutions 
tried out in the Air Force Program. It may be that a book which would satisfy 
the reviewer in this regard would be a much longer book than the present one; 
nevertheless, the reviewer feels it incumbent on him to evaluate the book in terms 
of his conception of what should be encompassed in a book with a title such as 
this one has. 

As stated above, the first eight chapters of the book are adapted from Thorn- 
dike’s AAF report. Many of these chapters are almost completely parallel to 
their counterparts in the earlier report, although nearly every other sentence 
has been re-written, and a considerable amount of material has been inserted to 
draw attention to civilian applications. The remainder of the book (Chapters 
9-11) represents completely new material on the administrative problems incident
to large-scale personnel research programs (for example, gaining acceptance
for a testing program, conduct of testing, organization of records, administrative
use of test results). This review will follow the natural division of the
book into two parts. 

The author states in his preface that for several years he has used the ma- 
terial of the book in a course in the theory of measurement. In the hands of an 
instructor who could provide additional systematic and illustrative material, the 
first eight chapters would undoubtedly be useful for such a course. In fact, the 
treatment of certain topics (for example, test reliability and validity) is quite 
extensive, and the presentation is with certain exceptions adequately clear for a 
student with prior training in statistics. (The author indicates that about a year 
of previous statistical training is assumed in the reader.) In general, the ma- 
terial presented is sound, or at least in line with currently accepted doctrine on 
test theory and statistical techniques. There are a number of gaps in the treat- 
ment, however, many of which were probably occasioned by the fact that in the 
AAF program certain approaches were emphasized to the exclusion of others, 
by the very nature of the problems encountered there. On the other hand, one 
basic problem which received extensive consideration in the AAF program is 
passed over here with only scant attention. 

This problem, which in the opinion of many is central to the development of 
personnel tests, is that of the rationale of individual differences and of the traits 
or complexes measured by psychological tests. Thorndike pays lip service to this
problem early in the book, for example in the chapter on Job Analysis, where he 
recommends that we must “have a sound set of categories in terms of which to 
describe qualities of behavior,” and that we must “show sagacity in identifying 
those categories in the job description” (p. 16). He then goes on to state that 
the set of categories should be comprehensive, organized, and systematic, and 
that the categories should be independent, psychologically meaningful, and sug- 
gestive of testing operations for their measurement. He also presents (pp. 17f.) 
a suggested outline of job analysis categories, an enumeration which, inciden- 
tally, omits the broad area of job skills and knowledges except for a category 
called “academic skill requirements.” Later (p. 37), in discussing Test Selection
and Invention he contrasts the job approach with the trait approach, the latter
based on “the general qualities of the individual rather than on the characteristics
of a specific job.” At this point, he fails to take advantage of the opportunity
to review at some length the current knowledge of the various domains
of individual differences. Instead, he only mentions in passing “such functions 
as numerical facility, verbal fluency, and perceptual speed.” (p. 38) Later still 
(p. 50), the following sentences appear in a paragraph headed “Original conception
of a test idea”:


A wide knowledge of existing tests provides the background for 
fruitful new combinations of testing materials and procedures. The test 
inventor should, therefore, become intimately acquainted with the job or 
jobs for which the tests are being developed. He should also have a wide 
acquaintance with existing test forms and with the literature on the de- 
termination of distinct traits or dimensions of human performance. 


But beyond a footnote reference (p. 40 fn.) to AAF reports on classification 
tests the reader is given no information on how to acquire a knowledge of psycho- 
logical research results in this area. No illustration of the development of a “test 
idea” is presented. The factorial approach is discussed, in a page or two of gen- 
eralized treatment, only once again in the remainder of the book (pp. 218-220)— 
despite the great attention it received in the Army Air Force program. 

This lack of attention to the nature of test continua gets Thorndike into difficulties
in the exposition of certain phases of test theory. For example, in the
discussion of item analysis by internal consistency, the reader will probably not 
fully understand the idea that items in a test should “contribute to the measure- 
ment of a single homogeneous function” (p. 232). The concept of a test score as 
a composite measure of a wide sampling of specific knowledge is not brought out, 
as it might well be on page 258 in the discussion of internal consistency analysis 
of a survey test of knowledge of the physical sciences. While the distinction be- 
tween aptitude and achievement tests may in fact be arbitrary and artificial, 
there is no discussion of the distinction, and the terms do not even appear in the 
index. 

The long chapter on the estimation of test reliability is one of the best that 
this reviewer has seen, from a didactic point of view. It draws a clear distinc- 
tion between absolute and relative measures of reliability and develops the theory 
of test reliability from a consideration of the sources of variance in test scores. 
Thorndike’s classification of sources of variance will seem unfortunate to many, 
however, and he fails to capitalize on a number of recent advances in reliability 
theory, for example Wherry and Gaylord’s work on the effect of the factorial 
composition of a test on the estimation of its reliability. 

It is in the chapter on criteria of proficiency that the specific influence of
the AAF program is most clearly evident. Thorndike emphasizes that criteria 
must be selected on rational grounds, a point of view with which most personnel 
psychologists are grudgingly forced to agree. But this reviewer finds the descrip- 
tion of “rational grounds” for selecting criteria couched in terms which do not 
positively lead to any heuristic procedures. Suppose we say, as Thorndike does 
(p. 125), that “a criterion measure is relevant as far as the knowledges, skills, 
and basic aptitudes required for success on it are the same as those required for 
performance of the ultimate task.” Would this definition help us to determine
how closely a proximate criterion is related to an ultimate criterion, or to deter- 
mine whether perchance we have an ultimate criterion at hand? The line of rea- 
soning selected by Thorndike seems to stem from the fact that by the nature of 
the circumstances the AAF program happened to involve an undue emphasis on 
proximate criteria such as training school success. The above-cited definition of 
criterion relevance crumbles to nothing when we begin to evaluate a criterion 
such as a worker’s production record or a supervisor’s rating, for it requires us 
not only to conceptualize the ultimate criterion but also to educe the basic skills, 
knowledges, and aptitudes required in both the proximate and the ultimate cri- 
terion. Thorndike pays very little attention to a type of criterion which has been 
widely used in military personnel research programs, namely, the criterion ob- 
tained by ratings or nominations by a number of associates. Whatever the vir- 
tues or faults of this procedure, Thorndike does not do it justice. In discussing 
“summary ratings” as he calls them, he correctly points out that they are sub- 
ject to varying standards of rating and to biases due to the use of such ratings 
for administrative actions. It is true that these problems are serious in connec- 
tion with official ratings such as those in a merit system, but they are minimized 
when the ratings are obtained only for research purposes. 

The chapter on criterion measures is also weak in that it does little to sug- 
gest the possibilities of research on the criterion itself, beyond the determination 
of its reliability. The impression is gained that the primary problem in this area 
is finding criteria, rather than the development and detailed analysis of criteria. 
For example, Thorndike could have suggested the factorial approach and could 
have emphasized more fully the problems in investigating the consistency of cri- 
terion measures over long periods of time. He could have pointed out, both here 
and in the chapter on job analysis, that it is more profitable to gain an idea of 
the traits required for success in a job through detailed descriptions of success- 
ful and unsuccessful behavior than through direct questioning as to what traits 
make for success. One other weakness of the chapter is that it fails to discuss a 
problem often encountered in personnel research, viz. that the criterion does not 
“stay put,” but changes in character from time to time. 

The chapter on the estimation of test validity presents an excellent discus- 
sion of the correlation measures to be recommended when one or the other of the 
variables is dichotomized, as frequently happens in personnel research. There is 
also an extensive treatment of the correction of correlations for restriction in 
range. The conventional product-moment correlation coefficient is mentioned only 
briefly, it being assumed that the reader will have had sufficient background to 
recognize this as the primary correlation technique in most situations. The reader 
will indeed need considerable sophistication in matrix algebra to comprehend a 
technical note (p. 176 fn.) on corrections to be applied when the sample has been 
curtailed on two or more variables. This material, incidentally, appears not to 
have been published elsewhere. 

Chapter 7, “Combining Tests into a Battery,” begins with a sketch of the 
derivation of multiple correlation formulas and continues with a discussion of 
topics such as factors determining multiple correlation, non-linear approaches 
including the multiple-cutoff method, the optimal size of a test battery, and the 
“combination of data from partial criteria” (i.e., when several criteria are avail- 
able on the same group or when the criteria differ from group to group). There 
is an interesting discussion of the use of test batteries for multiple selection and 
for classification, with a mathematical solution to the problem of the optimal 
classification of individuals into two job categories. 

Chapter 8 gives an elementary treatment of item analysis procedures cover- 
ing item difficulty, item discrimination, and the selection of items for a test. The 
analytical procedures suggested are conventional; mathematical formulations are 
conspicuously absent in this chapter as compared with some other chapters. 

The technical presentation in the first eight chapters, while it may satisfy 
certain standards of excellence, is only rarely thoroughgoing. Except possibly in 
the chapter on reliability, it somehow fails to get down to the basic theory of 
measurement; the student would come away from these chapters mainly with 
a bag of tricks rather than any theoretical comprehension of psychometric tech- 
niques. It is hard to see how this part of the book, by itself, would form an ade- 
quate framework for a course in the theory of measurement. Moreover, the treat- 
ment of certain statistical matters is loose: for example, in the derivation of 
multiple regression formulas (p. 189) the symbolism used for what Thorndike 
calls scores in raw score units should be identified as referring to raw score 
deviation units. Similarly, it should have been specified on p. 224 that the sym- 
bols, A, a, B, and @ refer to scores expressed in standard score units. On this 
same page, carelessness is evident in the fact that formula (16) fails to contain 
2 as a factor in the denominator of the fraction. On page 105, formula (17) does 
not contain an equality sign. On page 109, in the discussion below formula (19), 
the statement that “the validity increases only as a function of the square root 
of the reliability” is inaccurate, or at least misleading, because it misses the point 
that the extent to which a test gains in validity by lengthening is related in- 
versely to the initial reliability of the test. Things like these might harass a stu- 
dent who is not too well-grounded in statistics. Nevertheless, since the book was 
not intended as a statistics text, minor inaccuracies may perhaps be excused. 
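
The point about lengthening can be illustrated numerically. The sketch below is not taken from Thorndike's text; it assumes the standard classical test theory relations (the Spearman-Brown formula, and the usual expression for the validity of a test lengthened k times with parallel material), and the function names are hypothetical. Under those assumptions the proportional gain in validity depends only on k and the initial reliability, and it shrinks as the initial reliability rises.

```python
import math


def spearman_brown(r_xx, k):
    """Reliability of a test lengthened k times (assumes parallel added parts)."""
    return k * r_xx / (1 + (k - 1) * r_xx)


def validity_gain(r_xx, k):
    """Ratio of lengthened-test validity to original validity.

    Under the assumed model the validity of the lengthened test is
    r_xc * sqrt(k) / sqrt(1 + (k - 1) * r_xx), so the ratio is
    sqrt(k / (1 + (k - 1) * r_xx)), i.e. the square root of the
    ratio of new to old reliability.
    """
    return math.sqrt(k / (1 + (k - 1) * r_xx))


if __name__ == "__main__":
    # Doubling the test length at several illustrative initial reliabilities.
    for r_xx in (0.50, 0.70, 0.90):
        print(f"initial reliability {r_xx:.2f}: "
              f"new reliability {spearman_brown(r_xx, 2):.3f}, "
              f"validity multiplied by {validity_gain(r_xx, 2):.3f}")
```

With these illustrative values, doubling a test whose reliability is .50 multiplies its validity by a larger factor than doubling one whose reliability is .90, which is the inverse relation the reviewer has in mind.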

The last three chapters are concerned with what the author calls adminis- 
trative problems in conducting a personnel research program. These chapters 
are obviously the distillation of Thorndike’s experience in managing the adminis- 
trative details of a tremendously complicated and far-flung research enterprise 
and in dealing with a rather tough-minded sponsor. The lessons to be learned 
from this presentation will serve well an industrial or military psychologist faced 
with the problem of organizing and operating even a fair-sized research program. 

Chapter 9 covers administrative problems in the conduct and scheduling of 
testing, the computation of test scores and the reporting of results, and the or- 
ganization of records. Military psychologists were not well-prepared to handle 
some of these administrative problems effectively at the outset of the last war; 
many of their mistakes might never have occurred if the material which Thorn- 
dike presents here had been available. The section on the organization of re- 
ports does not go far enough in suggesting systematic ways of handling research 
data; it is concerned largely with problems of maintaining records of individual 
test scores and allied identification material. It would have been more helpful 
if it had discussed, in addition, problems of developing and filing research plans 
and reports, the identification and coding of variables, the clear description of 
research samples in the light of administrative changes in personnel and training 
procedures, and the organization of various ancillary records. 

Chapter 10 is entitled “Administrative Problems in Using the Results of an 
Aptitude Testing Program.” Among other things, it presents, perhaps for the 
first time, a cle«+ statement of the dilemma into which military psychologists 
were forced duzing the war — whether to classify the incoming stream of re- 
cruits according to a “daily quota” scheme, thus satisfying the immediate demands 
of the organization for “bodies,” or to classify them according to a “predicted 
yield” scheme which would afford greater opportunity to match the individual to 
the job. Thorndike presents the advantages and disadvantages of both schemes, 
but inclines to favor the “predicted yield” pattern. The principles involved here 
should be brought forcefully to the attention of military planners. During the 
last war, the “daily quota” scheme was the rule, although the British Army was 
able to approach a “predicted yield” system, probably because the smaller size 
of the country made a centrally operated classification agency more feasible than 
in the United States. Of course, one may speculate as to whether this problem 
is of much importance outside the military situation. An industrial organization 
would have to be very large, and personnel turnover would have to be very fast, 
before the problem of “predicted yield” vs. “daily quota” would present itself. 

The final chapter, “The Personnel Selection Program and the Public,” con- 
tains an extensive treatment of various graphic methods for presenting and ex- 
plaining research results to the sponsor of the research. In addition there are 
suggestions as to the “promotion of new personnel projects,” and the relationship 
between the personnel psychologist and the sponsor. Some exception may be 
taken to the tone of these remarks. Thorndike displays an attitude at many places 
throughout the book which gives the impression that the relationship between the 
personnel psychologist and the sponsor is delicate and that the psychologist must 
pay a great deal of attention to “promoting” himself and his work. “The psy- 
chologist proposes, but top management disposes” (p. 259). “It is of critical im- 
portance to sell the program to those members of top management who have 
powers of life or death over the program” (p. 312). Now, it is manifestly true 
that personnel research has not always been well received or understood by man- 
agement, and that the successful operation of many personnel research programs 
has been possible only through a considerable degree of aggressiveness on the 
part of professional personnel. Nevertheless, it is not a characteristic of a ma- 
ture profession that it talks about “promoting” or “selling” its product. The ac- 
tivity in which a personnel research organization engages when it proposes new 
programs or makes recommendations based on research results is, rather, in the 
nature of education and professional consultation. Both management and the 
profession of personnel research will stand to benefit more, in the long run, if 
an attitude of confidence and of a sincere desire to serve is adopted by personnel 
research workers. Since this attitude is, perhaps inadvertently, inadequately rep- 
resented by the present book, it cannot be recommended as a book to be placed in 
the hands of management personnel. And though in other respects it will be 
highly useful for professional persons in charge of personnel research organiza- 
tions, the somewhat Machiavellian flavor of the material on professional problems 
will detract from its ultimate utility. 

Thorndike’s book is provocative. His treatment of the classification problem 
charts a new path in personnel research, for example. As a result of the discus- 
sion of the problems in obtaining criteria and in validating tests, this reviewer 
has been stimulated to think anew along these lines: “Personnel research is slow. 
Even the initial development of a test battery takes time, and we might wait 
years for adequate criteria to become available. And by that time the criteria 
might be no longer relevant to what we wanted to predict when we started. In 
military psychology, at least, we are always about one war behind — we could do 
a swell job on the last war if we could start over with what we know now. How 
can we break out of this difficulty?” 


Harvard University John B. Carroll 